Two people in a vibrant, neon-lit room with VR gear; large “REALITY REMIX” text overlays the image.

AI video editing using Runway: Part Three

Image and text to video using runway

We continue our series introducing one of the world’s most popular online generative AI video production toolkits, as we create a short video for a brand new immersive experience.


The AI tools and models used in this series

We’ll use runway.ml to generate video clips and assemble them into a sequence, and we’ll use additional AI tools to help with initial ideation and to write the text prompts that help runway generate high-quality video clips.

The main steps in the project:

  1. Generate an overview for an ‘immersive experience’ promotional video using perplexity.ai
  2. Generate still images using the flux-1.1-pro model at replicate.com to use as ‘first frame’ prompts at runway
  3. Generate short video clips using runway’s latest Gen-4 model
  4. Use runway’s online video editor to sequence the promotional video
  5. Use the speech-02-hd AI model at replicate.com to generate a realistic-sounding voiceover
  6. Use udio.com to generate a background music track
  7. Add text titles and export the video file

Part 3: Video clip generation at runway.ml

We’ve set a $15 budget to use with runway.ml (i.e. a single month of the ‘standard’ plan), as this is a realistic amount for trying out the available tools. Runway’s ongoing monthly pricing model is as follows:

A pricing page displays five subscription plans: Free, Standard, Pro, Unlimited, and Enterprise with feature lists.

Having tried runway’s still image generation in the previous part of this series, we ran the same image prompt through one of our favourite standalone image-generation models, Flux 1.1 pro, accessed via replicate.com. Here’s an example of the result:

A computer screen displays an AI image generation interface with a futuristic digital art scene on the right side.
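For readers who would rather script this step than use the web interface, here is a minimal sketch of calling the same model with Replicate's Python client. It assumes the `replicate` package is installed and a `REPLICATE_API_TOKEN` environment variable is set. The helper names (`build_flux_input`, `generate_first_frame`) and the output filename are our own illustrative choices, not part of any official workflow, and each call is billed.

```python
import urllib.request

# Model identifier on replicate.com
FLUX_MODEL = "black-forest-labs/flux-1.1-pro"


def build_flux_input(prompt: str, aspect_ratio: str = "16:9") -> dict:
    """Build the input payload for one still image (hypothetical helper)."""
    return {
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,  # matches the 16:9 video we generate later
        "output_format": "png",
    }


def generate_first_frame(prompt: str, out_path: str = "first_frame.png") -> str:
    """Generate one image and save it locally. This makes a paid API call."""
    import replicate  # third-party client: pip install replicate

    output = replicate.run(FLUX_MODEL, input=build_flux_input(prompt))
    if hasattr(output, "read"):
        # Recent client versions return a file-like object
        with open(out_path, "wb") as fh:
            fh.write(output.read())
    else:
        # Older client versions return a URL to the generated image
        urllib.request.urlretrieve(str(output), out_path)
    return out_path
```

Calling `generate_first_frame("your image prompt here")` would save a single PNG that can then be uploaded to runway as a first-frame image. Keeping the payload builder separate from the network call makes it easy to check the settings without spending credits.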

A single still image generated in Flux 1.1 pro at replicate.com costs around $0.04, while each runway image costs around $0.15 (and they’re created in batches of four). Our advice is to use a standalone image generation model if you already have a preferred one, then use runway for its video generation abilities.
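To make the difference concrete, here is a rough cost sketch using the approximate per-image prices quoted above. It assumes one generation request per shot, that every runway request produces a full batch of four images, and that a six-shot video needs six first-frame images; real costs will vary with retries and pricing changes.

```python
# Approximate prices quoted in this article (US dollars)
FLUX_COST_PER_IMAGE = 0.04
RUNWAY_COST_PER_IMAGE = 0.15
RUNWAY_BATCH_SIZE = 4  # runway creates images in batches of four


def flux_cost(requests: int) -> float:
    """Cost of generating one Flux image per request."""
    return round(requests * FLUX_COST_PER_IMAGE, 2)


def runway_cost(requests: int) -> float:
    """Cost of runway requests, each producing a full batch of images."""
    return round(requests * RUNWAY_BATCH_SIZE * RUNWAY_COST_PER_IMAGE, 2)


shots = 6  # assumption: one first-frame image per shot
print(f"Flux:   ${flux_cost(shots):.2f}")    # Flux:   $0.24
print(f"Runway: ${runway_cost(shots):.2f}")  # Runway: $3.60
```

The runway batch does give four candidates to choose from, so the comparison is closer if you would otherwise regenerate a Flux image several times.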

We’ll use this image generated at replicate.com as the prompt for our first video generation. Here it is:

People wearing VR headsets walk through a room with colorful digital screens and futuristic light displays.

We’ve set runway to use Gen-4 (the ‘best’ video generation model) and generate a 5 second video in 16:9 aspect ratio:

A computer screen displays an AI image generation tool creating futuristic, virtual reality-themed digital artworks.

Click the ‘Generate’ button to create the video clip. The tip below suggests adding camera movements to an (optional) text prompt along with the image prompt:

A screen shows a video generation progress bar at 19% in an AI tool, with an example image on the left side.

This video generation took around 90 seconds (your mileage may vary depending on time of day and network load). Here’s the five-second video clip runway generated, downloaded directly from the site:

This video demonstrates an improvement over the motion generated when we tried runway in 2024. People do not visibly warp or exhibit obvious errors, and they move realistically within the scene. If you look closely at the people in the distance you may notice unnatural movement, but this is only visible when you look for it.

A generated video clip can be upscaled to 4K resolution inside runway for an additional 10 credits. Upscaling can also be done using other models (including those at replicate.com) or in video editing software such as Vegas Pro and Premiere Pro.

A video player interface shows 4K quality, volume, fullscreen, and download icons on a colorful background.
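Since every clip and every upscale draws on the same pool of credits, it helps to estimate how many clips a month's allowance will cover. The sketch below is illustrative only: the 10-credit upscale figure comes from this article, but the monthly allowance and the per-second generation rate are assumptions passed in as arguments, so check runway's pricing page for current values before relying on the numbers.

```python
def credits_for_clip(seconds: int, credits_per_second: int,
                     upscale: bool = False, upscale_credits: int = 10) -> int:
    """Credits used by one clip, optionally including a 4K upscale."""
    total = seconds * credits_per_second
    if upscale:
        total += upscale_credits
    return total


def clips_within_budget(monthly_credits: int, seconds: int,
                        credits_per_second: int, upscale: bool = False) -> int:
    """How many whole clips fit into the monthly credit allowance."""
    return monthly_credits // credits_for_clip(seconds, credits_per_second, upscale)


# ASSUMED example values, not taken from this article:
MONTHLY_CREDITS = 625     # assumed allowance for the plan
CREDITS_PER_SECOND = 12   # assumed generation rate for the video model

print(clips_within_budget(MONTHLY_CREDITS, 5, CREDITS_PER_SECOND))                # 10
print(clips_within_budget(MONTHLY_CREDITS, 5, CREDITS_PER_SECOND, upscale=True))  # 8
```

Under those assumed figures, upscaling every clip would cost roughly two clips' worth of generations per month, which is one reason to upscale only the takes that make the final cut.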

Next, we add camera movement guidance as a text prompt to accompany the image prompt, before regenerating the video. Runway’s advice on this is in its Gen 4 text prompting guide.

The TLDR version of the prompting guide linked above is to ‘keep it simple’ and include information about subject motion, camera motion, scene motion and style descriptors:

A webpage displays prompt engineering tips, example prompts, and side-by-side input and output images of a mechanical bull.

We add the camera direction prompt: ‘A steadycam zooms into the scene at a medium pace as the room rotates slowly. The people are walking slowly. Dynamic advertising footage’:

People wearing VR headsets walk through a brightly lit, colorful digital art exhibition with swirling light patterns.
A video editing interface shows a sci-fi scene with people and colorful lights; video progress is at 27 percent.
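If you plan to write prompts for several shots, it can help to keep the four ingredients from the prompting guide (subject motion, camera motion, scene motion and style descriptors) as separate fields and join them at the end. The sketch below is one possible way to do that; the `MotionPrompt` class and its field order are our own hypothetical convention, not something runway prescribes. The example values are a lightly split-up version of the prompt used above.

```python
from dataclasses import dataclass


@dataclass
class MotionPrompt:
    """The four prompt ingredients, kept separate for easy editing."""
    subject_motion: str = ""
    camera_motion: str = ""
    scene_motion: str = ""
    style: str = ""

    def render(self) -> str:
        """Join the non-empty parts into short, simple sentences."""
        parts = [self.camera_motion, self.scene_motion,
                 self.subject_motion, self.style]
        sentences = [p.strip().rstrip(".") for p in parts if p.strip()]
        return ". ".join(sentences) + ("." if sentences else "")


prompt = MotionPrompt(
    subject_motion="The people are walking slowly",
    camera_motion="A steadycam zooms into the scene at a medium pace",
    scene_motion="The room rotates slowly",
    style="Dynamic advertising footage",
)
print(prompt.render())
```

Leaving a field empty simply drops that sentence, which makes it easy to test the effect of one ingredient at a time between generations.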

Here’s the updated video clip based on the additional camera information:

The text instructions have been followed, though the room is actually rotating less than in the first clip, where rotation was not specifically requested. We’ll select one of the two clips to use as the opening ‘hook’ shot for our immersive experience promo video.

We’ll continue using the workflow above to generate video clips for the next shots for our short video promo. As a reminder…

| Time | Visuals / Actions | Voiceover / Text (Example) | Audio / SFX |
| --- | --- | --- | --- |
| 0:00–0:03 | Hook: Fast-paced montage of immersive visuals - people stepping into a vibrant, interactive environment (e.g., VR headsets, projection mapping, tactile displays). | “What if you could step inside the story?” | Ambient build-up, whoosh SFX |
| 0:03–0:08 | Reveal the Experience: Show signature moments - participants reaching out, reacting with awe, 360° camera spins, glimpses of the environment’s highlights. | “Welcome to [Experience Name] - where reality blurs with imagination.” | Uplifting, cinematic music rises |
| 0:08–0:15 | Features/Highlights: Quick cuts of interactive elements - touch, sound, movement; show diversity of participants (families, friends, individuals). | “Explore, interact, and lose yourself in a world designed to ignite your senses.” | Layered sound effects (laughter, interactive sounds) |
| 0:15–0:22 | Emotional Connection: Faces of wonder, laughter, and surprise; slow-motion shot of a key moment (e.g., a room transforming, a dramatic reveal). | “Every moment is unforgettable. Every step, a new adventure.” | Music swells, heartbeat or pulse SFX |
| 0:22–0:27 | Urgency/Exclusivity: Quick flashes of tickets, countdown, or “Limited Time Only” overlay. | “Tickets are limited. Don’t miss your chance.” | Music intensifies, subtle ticking |
| 0:27–0:30 | Call to Action: Logo animation, website, and social handle appear. | “Book now at [website]. Experience the extraordinary.” | Music resolves, memorable audio logo |
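As a planning aid, the shot list can be turned into a rough count of how many clips we need to generate. This sketch assumes every clip is five seconds long (the length used in this part) and that each segment is covered by whole clips trimmed in the editor. It gives a lower bound only, since montage and quick-cut segments will likely need extra clips.

```python
import math

CLIP_SECONDS = 5  # clip length used in this part of the series

# (start second, end second, segment name) taken from the shot list
SHOTS = [
    (0, 3, "Hook"),
    (3, 8, "Reveal the Experience"),
    (8, 15, "Features/Highlights"),
    (15, 22, "Emotional Connection"),
    (22, 27, "Urgency/Exclusivity"),
    (27, 30, "Call to Action"),
]


def is_contiguous(shots) -> bool:
    """Check that each segment starts where the previous one ends."""
    return all(prev[1] == nxt[0] for prev, nxt in zip(shots, shots[1:]))


def clips_needed(shots, clip_seconds: int = CLIP_SECONDS) -> dict:
    """Minimum number of clips required to cover each segment."""
    return {name: math.ceil((end - start) / clip_seconds)
            for start, end, name in shots}


plan = clips_needed(SHOTS)
print(is_contiguous(SHOTS))  # True
print(sum(plan.values()))    # 8
```

Eight five-second clips is therefore the minimum for the 30-second promo under these assumptions, before allowing for retakes.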

In part four of this series we’ll see what runway comes up with for the remaining video clips!