This is the latest entry in our ‘State of the art-ful AI’ series, where we upload a photo to Chat GPT 4 and ask it for a photo description, then ask DALL-E 3 to generate an image using the description. How similar will they be?
Why this workflow?
Less than a year after its launch, ChatGPT and DALL-E have already seen multiple iterations. In October 2023, ChatGPT 4 gained the capability to describe uploaded photos. To track the progress of generative AI, we uploaded these AI descriptions to the Dall-E 3 image generation model to see how faithful its output is to the original photo. A variety of photos have been selected which we’ll bring you throughout November and December.
Our original photo
This image was taken on a local patio. There is a thin coating of frost across the scene.
The photo as described by Chat GPT 4 in 150 words
This image features a serene garden setting with various elements that together create a peaceful ambiance. In the center, there is a gray stone statue of Buddha seated in a meditative pose with a gentle expression, hands in his lap, holding a small bowl.
Surrounding the Buddha are a variety of potted plants; to the left, a terracotta pot with ornate detailing stands empty, while other pots contain plants with silvery leaves that have frost-like edges, hinting at cold weather. There’s a diversity of foliage textures and colors ranging from deep greens to frosty whites, with a large, rust-colored pot in the background contributing a warm earthy tone. The presence of fallen leaves and the dormant state of some plants suggest it’s either late fall or winter. The setting is completed by a miniature fence, adding a quaint touch to the composition. The overall effect is one of tranquility and natural beauty.
Photos generated by Dall-E 3 using the AI description
Spot the difference
Using a separate chat, we asked Chat GPT what differences it finds between the original photo and the FIRST image generated above:
- The first photo is a real-life image with natural lighting, while the second one appears to be a digital rendering or an image with enhanced saturation and contrast.
- In the first photo, the plants and leaves are coated with frost, indicating it was taken in cold weather, whereas the second photo shows the plants in a healthy, unfrosted state, suggesting a warmer or different season.
- The Buddha statue in the first photo holds a bowl containing a few leaves, possibly affected by the cold as they look wilted. In contrast, the bowl in the second photo contains what appears to be a round object, possibly a seed or stone.
- The surrounding plants in the first photo are sparse and show signs of weathering or seasonal decay, while the plants in the second photo are lush and vibrant.
- There is a visible difference in the variety of plants; the first photo has fewer species, and the second photo includes a wider variety, such as large succulents and fuller grasses or shrubs.
- The first photo has a muted color palette due to the frost and lighting, whereas the second photo is very vibrant with rich greens and other colors.
- The background in the first photo shows more pots and a blue container, which are not present in the second photo. Instead, the second photo features a wooden fence and additional greenery.
- The second photo has additional elements like a Buddha figure in the background and more ornate pots, which are absent in the first photo.
- The overall composition and angle of the two photos are similar, but the second photo seems to have a tighter frame focusing more on the Buddha statue and the plants immediately surrounding it.
- The textures in the first photo show the natural roughness and frostbite on the plants and surfaces, while the second photo’s textures are smoother and have a more polished appearance.
Example workflow video
Here’s our workflow in action, using an ice cream truck image. This is not an explainer video – It shows the main steps of the above ChatGPT 4 workflow in November 2023.
New to pixels.cool?
Have a look at our year planner RIGHT HERE!