State of the art-ful AI: Feline double exposure

From AI to AI and back again

This is the latest entry in our ‘State of the art-ful AI’ series, where we upload a photo to Chat GPT 4 and ask it for a photo description, then ask DALL-E 3 to generate an image from the description. How similar will they be?


Why this workflow?

Less than a year after its launch, ChatGPT and DALL-E have already seen multiple iterations. In October 2023, ChatGPT 4 gained the capability to describe uploaded photos. To track the progress of generative AI, we uploaded these AI descriptions to the Dall-E 3 image generation model to see how faithful its output is to the original photo. A variety of photos have been selected which we’ll bring you throughout November and December.


Our original photo

A photo taken of a cat through a glass window. This image has been heavily edited.


The photo as described by Chat GPT 4 in 150 words

This image is a striking composition that captures the essence of curiosity and domestic tranquility. It features a tabby cat with prominent stripes and a classic M-shaped marking on its forehead, a characteristic trait of the breed. The cat’s green eyes are wide and attentive, reflecting a sense of wonder or concentration, possibly observing something outside with great interest. The cat is behind a transparent barrier, likely a window, which adds a layer of depth to the image.

The reflection of the cat merges with a double exposure effect that includes the silhouette of a person, possibly the owner, creating an overlay that is both ghostly and intimate. The backdrop reveals a serene blue sky peppered with soft clouds and the faint outline of residential buildings, suggesting an urban home setting. There’s also a hint of greenery, possibly a garden or trees, adding a touch of nature to the scene. The overall composition is a dance between the indoor and outdoor realms, feline grace, and the human element, all bathed in natural light that enhances the textures and colors of the scene.


Photos generated by Dall-E 3 using the AI description

The conversation with GPT 4 initially went astray…

I apologize for the confusion, but it appears that the request for this specific image cannot be processed due to our content policy. If you have any other ideas or requests for images, please feel free to share them, and I’ll do my best to assist you within the guidelines of our policy.

why does this image breach your content policy

The image request you provided doesn’t actually breach our content policy. It appears that there was an error in processing the request. Let’s try again. I’ll re-submit the prompt to generate the image based on your detailed description…

I apologize for the confusion, but it appears that the request for this specific image cannot be processed due to our content policy. If you have any other ideas or requests for images, please feel free to share them, and I’ll do my best to assist you within the guidelines of our policy.

We removed the term ‘and intimate’ from the description (which was suggested by ChatGPT itself) and the model accepted the prompt:


Spot the difference

Using a separate chat, we asked Chat GPT what differences it finds between the original photo and the FIRST image generated above:

  1. Human Presence: The second image includes the silhouette of a person standing at the door, with a visible light source behind them, which is absent in the first image.
  2. Time of Day: The second image appears to be set during dusk or dawn, indicated by the warmer colors and the light coming from the door, while the first image is set during the daytime.
  3. Visibility of the Interior: The interior of the house is more visible in the second image, with clear views of the lamps and the room inside, whereas in the first image, the interior is less visible.
  4. Clarity: The first image has a more translucent appearance of the cat with multiple layers of reflection, making it look more ethereal, while the second image features a more solid and clear depiction of the cat.
  5. Color Palette: The overall color palette is cooler in the first image, with more blues and greens, whereas the second image has a warmer palette with yellows and oranges.
  6. Reflections: The reflections on the glass are more pronounced and complex in the first image, with multiple layers and distortions, compared to the second image, where the reflections are simpler and more straightforward.
  7. Cat’s Expression: The cat’s expression in the first image looks more pensive and alert, while in the second image, the cat seems calmer and more serene.
  8. Quality of the Images: The first image has a more artistic, blurry effect, possibly due to motion or multiple exposure techniques, while the second image is more of a high-definition, realistic portrayal.


Example workflow video

Here’s our workflow in action, using an ice cream truck image. This is not an explainer video – It shows the main steps of the above ChatGPT 4 workflow in November 2023.