State of the art-ful AI: A touch of frost

From AI to AI and back again

This is the latest entry in our ‘State of the art-ful AI’ series, where we upload a photo to Chat GPT 4 and ask it for a photo description, then ask DALL-E 3 to generate an image using the description. How similar will they be?

Why this workflow?

Less than a year after its launch, ChatGPT and DALL-E have already seen multiple iterations. In October 2023, ChatGPT 4 gained the capability to describe uploaded photos. To track the progress of generative AI, we uploaded these AI descriptions to the Dall-E 3 image generation model to see how faithful its output is to the original photo. A variety of photos have been selected which we’ll bring you throughout November and December.

Our original photo

This image was taken on a local patio. There is a thin coating of frost across the scene.

The photo as described by Chat GPT 4 in 150 words

This image features a serene garden setting with various elements that together create a peaceful ambiance. In the center, there is a gray stone statue of Buddha seated in a meditative pose with a gentle expression, hands in his lap, holding a small bowl.

Surrounding the Buddha are a variety of potted plants; to the left, a terracotta pot with ornate detailing stands empty, while other pots contain plants with silvery leaves that have frost-like edges, hinting at cold weather. There’s a diversity of foliage textures and colors ranging from deep greens to frosty whites, with a large, rust-colored pot in the background contributing a warm earthy tone. The presence of fallen leaves and the dormant state of some plants suggest it’s either late fall or winter. The setting is completed by a miniature fence, adding a quaint touch to the composition. The overall effect is one of tranquility and natural beauty.

Photos generated by Dall-E 3 using the AI description

Spot the difference

Using a separate chat, we asked Chat GPT what differences it finds between the original photo and the FIRST image generated above:

  1. The first photo is a real-life image with natural lighting, while the second one appears to be a digital rendering or an image with enhanced saturation and contrast.
  2. In the first photo, the plants and leaves are coated with frost, indicating it was taken in cold weather, whereas the second photo shows the plants in a healthy, unfrosted state, suggesting a warmer or different season.
  3. The Buddha statue in the first photo holds a bowl containing a few leaves, possibly affected by the cold as they look wilted. In contrast, the bowl in the second photo contains what appears to be a round object, possibly a seed or stone.
  4. The surrounding plants in the first photo are sparse and show signs of weathering or seasonal decay, while the plants in the second photo are lush and vibrant.
  5. There is a visible difference in the variety of plants; the first photo has fewer species, and the second photo includes a wider variety, such as large succulents and fuller grasses or shrubs.
  6. The first photo has a muted color palette due to the frost and lighting, whereas the second photo is very vibrant with rich greens and other colors.
  7. The background in the first photo shows more pots and a blue container, which are not present in the second photo. Instead, the second photo features a wooden fence and additional greenery.
  8. The second photo has additional elements like a Buddha figure in the background and more ornate pots, which are absent in the first photo.
  9. The overall composition and angle of the two photos are similar, but the second photo seems to have a tighter frame focusing more on the Buddha statue and the plants immediately surrounding it.
  10. The textures in the first photo show the natural roughness and frostbite on the plants and surfaces, while the second photo’s textures are smoother and have a more polished appearance.

Example workflow video

Here’s our workflow in action, using an ice cream truck image. This is not an explainer video – It shows the main steps of the above ChatGPT 4 workflow in November 2023.