Which AI Image Tool Generates the Most Realistic Images?
Side-by-side comparisons of today's top tools.
👋 Hey, I’m Casandra. I share really good business ideas to help you start and grow a business. Become a Premium subscriber to access the full archive and Premium Perks like my direct support.
The world is seeing a massive influx of AI-generated images, most of which sit in the uncanny valley. They look…off.
The uncanny valley describes the phenomenon where human-like objects elicit feelings of unease or revulsion when they closely, but not perfectly, resemble actual humans.

But the technology for creating realistic, photography-style images has improved rapidly over the last year.
I put the top tools—like Gemini, Midjourney, and ChatGPT—through a creative obstacle course of real-world prompts: photorealistic portraits, images with text, modern product photography, city street scenes, fashion editorials, and even iconic natural landscapes. The results? Some were stunning. Others… not so much.
This side-by-side comparison reveals what these tools really get right—and where they still fall apart. Plus, I’ll share my choice for best overall product at the end!
AI Image Generation Tools Overview
I decided to compare the three leading image generation tools, plus one under-the-radar tool that you might not know about.
Gemini Imagen 3
Google’s LLM, Gemini, got into the image generation game somewhat late. It first offered its image generation model, Imagen 3, in February 2024 but quickly limited it due to technical issues.1 However, since fully rolling it back out, it’s quickly become one of the better options on the market, especially when you consider it’s free to use.
Editing: No editing features. New images can be generated through prompt refinement.
Price: Free
Midjourney Version 6.1
Midjourney has been a leading AI image generation tool for several years, and its latest model, Version 6.1, was released in July 2024. It can produce realistic results with good prompting, but it really excels with highly artistic and surreal or fantastical images. Note: Midjourney generates four images to choose from for each prompt.
Editing: Many editing features are available to get images just right.
Price: Starts at $10/month
ChatGPT-4o
ChatGPT recently released a new image generation model as part of its 4o model (replacing DALL-E 3), which has been getting a lot of press for its quality.2 While it was initially meant to roll out to all users, demand has been so high that OpenAI has had to limit it to paid users for the time being.3
Editing: Images can be edited through prompting.
Price: $20/month for ChatGPT Plus
Substack Image Generation
Many don’t realize that you can generate images directly in Substack. While it’s not exactly touted as a leading image generation model, it is incredibly convenient for publishers to use, so I’ve included it in the comparisons. Note: Substack generates four images to choose from for each prompt.
Editing: No editing features. New images can be generated through prompt refinement.
Price: Free if you have a Substack publication.
Challenge #1: Realistic Human Portrait
🪄 Prompt: A close-up portrait of a woman in natural light, freckles, soft-focus background, photorealistic, 35mm lens, shallow depth of field.
🧠 Tests: Human features, realism, skin texture, lighting, and eye rendering.
👀 Results: The ChatGPT image looks the most realistic. The Gemini image and some of the Midjourney images look close but a bit too smooth. The Substack images definitely look rendered.
Gemini Imagen 3
Midjourney Version 6.1
ChatGPT‑4o
Substack Image Generation
Challenge #2: Images With Text
🪄 Prompt: A vintage book cover with the title ‘The Electric Forest’, stylized type, floral borders, aged paper texture, Art Nouveau style.
🧠 Tests: Ability to render actual legible and stylistic text.
👀 Results: ChatGPT was the only tool to render the title correctly on a realistic-looking book cover. Midjourney couldn’t handle the text at all, but the floral borders are quite nice. Substack handled the text quite well, but it’s not really a realistic-looking book cover.
Gemini Imagen 3
Midjourney Version 6.1
ChatGPT‑4o
Substack Image Generation
Challenge #3: Fashion Editorial
🪄 Prompt: A high-fashion editorial photo of a model in an avant-garde pink lace gown, standing on a sailboat at sunset, cinematic lighting, Vogue-style.
🧠 Tests: Fabric rendering, fine details, hands, composition, aesthetics.
👀 Results: This one is a bit subjective. The ChatGPT and Gemini images look the most like realistic, highly Photoshopped fashion editorials, but (IMHO) the Substack dresses are much more stylish.
Gemini Imagen 3
Midjourney Version 6.1
ChatGPT‑4o
Substack Image Generation
Challenge #4: Product Shot
🪄 Prompt: Product shot of a cappuccino, bright solid color background, bright lighting similar to contemporary direct to consumer brands.
🧠 Tests: Cleanliness, shadow quality, product geometry, photorealism.
👀 Results: The ChatGPT image looks like what you would see on a modern product page with simple latte art and a crisp, clean background. The Gemini image is missing expected latte art, and the Midjourney and Substack images just look kooky.
Gemini Imagen 3
Midjourney Version 6.1
ChatGPT‑4o
Substack Image Generation
Challenge #5: Street Scene
🪄 Prompt: A rainy Tokyo street at night, neon signs, reflections in puddles, people with umbrellas, cinematic atmosphere, cyberpunk style.
🧠 Tests: Reflections, color grading, urban realism, crowd rendering.
👀 Results: The Gemini image created the right vibe without any obvious issues. The “P” on the Panasonic sign looks strange in the ChatGPT image, and the unexpected symmetry of the people with umbrellas also creates a menacing vibe. The Midjourney and Substack images look cartoonish. PS. If anyone can read Japanese, I’d love to know how accurate the signs are!
Gemini Imagen 3
Midjourney Version 6.1
ChatGPT‑4o
Substack Image Generation
Challenge #6: Complex Object Interaction
🪄 Prompt: A child holding a glass orb with a tiny galaxy inside, light reflections on the orb, accurate hand anatomy, shallow depth of field.
🧠 Tests: Hand-object interaction, transparency, reflections, small-scale realism.
👀 Results: Gemini handles the hand detail and the interaction between the hand and the orb well. The orbs in all the images look like they’re floating rather than being held, but the orb’s transparency is handled quite well in Midjourney, and the hand detail from ChatGPT is excellent. Although I had to change “child” to “person” for ChatGPT to create the image, the hand it generated still looks young.
Gemini Imagen 3
Midjourney Version 6.1
ChatGPT‑4o
Substack Image Generation
Challenge #7: Natural Landscape
🪄 Prompt: Yosemite Valley with El Capitan and Half Dome visible in the distance, early morning fog, golden sunrise light casting long shadows, realistic National Geographic-style photo.
🧠 Tests: Landmark accuracy, depth, lighting, composition.
👀 Results: Besides Substack, these are all quite nice, but only the ChatGPT image looks realistic to me. Although I love the wildflower detail in the Gemini image, it and the Midjourney images look computer-generated.
Gemini Imagen 3
Midjourney Version 6.1
ChatGPT‑4o
Substack Image Generation
Final Verdict
Overall Winner: ChatGPT-4o is the clear winner, if you’re willing to pay.
ChatGPT clearly came out on top. Besides the strange Panasonic sign in the street scene, every image it generated was quite good and directly addressed the prompt.
This test has convinced me to switch from Midjourney to ChatGPT-4o, at least as far as paid tools go. I do like Midjourney’s many options for refining and editing images, but that process can also be a big timesuck. I can still edit through prompting with ChatGPT, but more importantly, I found that the images just don’t need as much refinement as with the other tools.
Free Tool Winner: Go with Gemini if you don’t want to pay.
Gemini's images were consistently better than those from both Midjourney and Substack. The major downside is that there are no editing features: the only way to change an image is to refine your prompt and generate a brand-new one each time.
I’d love to hear about your experience with AI image generation! Do you have a preferred tool? Are there any tips and tricks you’re willing to share? Did you find any more flaws in the photos I presented that I missed?
To endless possibilities,
Casandra