Image-to-Video Generation
The Image-to-Video node takes a still image and animates it into a video clip. It uses the same 10 engines as Text-to-Video, with the addition of an image input for visual reference.
Inputs
| Handle ID | Data Type | Label |
|---|---|---|
| image-in | Image | Image |
| text-in | Text | Prompt |
Outputs
| Handle ID | Data Type | Label |
|---|---|---|
| video-out | Video | Video |
Available engines
| Engine ID | Label | Cost (5s, standard) |
|---|---|---|
| kling-o3 | Kling O3 | 5 credits |
| kling-3.0 | Kling 3.0 | 5 credits |
| kling-2.6 | Kling 2.6 | 4 credits |
| fal-seedance-1.5 | Seedance 1.5 | 4 credits |
| fal-wan-2.6 | Wan 2.6 | 4 credits |
| grok-imagine-video | Grok Imagine | 4 credits |
| sora-2 | Sora 2 | 7 credits |
| fal-veo3 | Veo 3.1 (Google) | 8 credits |
| fal-minimax-hailuo-2 | MiniMax Hailuo 2 | 5 credits |
| fal-ltx-2.3 | LTX 2.3 | 2 credits |
How it works
- An image is received from the input handle (from Image Generation, Media Upload, etc.)
- A text prompt describes the desired motion and action
- The AI model animates the image according to the prompt
- The resulting video clip is output
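The flow above can be sketched as a minimal, hypothetical node. The class and method names here are illustrative only, not the product's actual API; the handle IDs and engine IDs come from the tables above, and the engine call itself is stubbed out.

```python
# Hypothetical sketch of the Image-to-Video node's data flow.
# Only the handle IDs ("image-in", "text-in", "video-out") and engine IDs
# are taken from the documentation; everything else is illustrative.
from dataclasses import dataclass


@dataclass
class ImageToVideoNode:
    engine: str = "kling-3.0"  # any Engine ID from the table above

    def run(self, inputs: dict) -> dict:
        # Handle IDs match the Inputs/Outputs tables.
        image = inputs["image-in"]   # still image to animate
        prompt = inputs["text-in"]   # text describing the desired motion
        # A real node would call the selected engine here; we return a stub
        # standing in for the generated clip.
        clip = {"engine": self.engine, "source": image, "motion": prompt}
        return {"video-out": clip}


node = ImageToVideoNode(engine="fal-ltx-2.3")
result = node.run({"image-in": "portrait.png",
                   "text-in": "slow camera pan left"})
```

In a two-step pipeline, `inputs["image-in"]` would be wired from an Image Generation or Media Upload node's output rather than supplied directly.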
Configuration options
Same as Text-to-Video — engine selection and all advanced options apply identically.
Credit cost
Same pricing as Text-to-Video. Cost = per-second rate × duration.
Tips
- The text prompt should describe motion — what should move, how the camera should pan, etc.
- Combine with Image Generation for a two-step pipeline: generate image → animate to video
- Image-to-video typically produces more consistent results than text-to-video since it has a visual reference
- Works especially well with the Multi-Shot Generation pipeline
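The pricing formula under Credit cost (Cost = per-second rate × duration) can be illustrated with a small estimator. This assumes the per-second rate is simply the table's 5-second standard cost divided by 5 — an assumption, since the table only lists 5-second prices.

```python
# Illustrative credit estimator. Engine IDs and 5-second costs come from the
# engine table; deriving a per-second rate as (5s cost / 5) is an assumption.
COST_5S = {
    "kling-o3": 5, "kling-3.0": 5, "kling-2.6": 4, "fal-seedance-1.5": 4,
    "fal-wan-2.6": 4, "grok-imagine-video": 4, "sora-2": 7,
    "fal-veo3": 8, "fal-minimax-hailuo-2": 5, "fal-ltx-2.3": 2,
}


def estimate_credits(engine: str, duration_s: float) -> float:
    """Cost = per-second rate × duration, with rate = 5s cost / 5."""
    return COST_5S[engine] / 5 * duration_s


print(estimate_credits("fal-veo3", 10))  # 8 credits per 5s → 16.0 for 10s
```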
Example use cases
- Animating AI-generated character images into scenes
- Adding subtle motion to product shots
- Creating video from illustrated storyboard frames