Mixed Media Analysis
The Mixed Media Analysis node uses multimodal AI to analyze a combination of images, videos, and text. It can describe, compare, extract information, or generate prompts based on mixed media inputs.
Inputs
| Handle ID | Data Type | Label |
|---|---|---|
| text-in | Text | Instruction |
Outputs
| Handle ID | Data Type | Label |
|---|---|---|
| text-out | Text | Prompt / Analysis |
Available engines
| Engine ID | Label | Cost |
|---|---|---|
| gemini-flash | Gemini Flash | 1 credit |
| gemini-pro | Gemini Pro | 2 credits |
How it works
- Connect text instructions, images, and/or videos to the input handles
- Write an instruction describing what you want the AI to analyze or produce
- The multimodal AI processes all inputs together
- The node outputs the result as text: a description, an analysis, or a generated prompt
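The steps above can be pictured as bundling every connected input into a single request before the model sees it. The sketch below is only an illustration of that data flow, not the product's API: `MixedMediaInputs`, `build_request`, the payload keys, the file references, the default engine, and the rule that an instruction is required are all assumptions made for the example. Only the engine IDs and the `text-in` handle come from the tables on this page.

```python
from dataclasses import dataclass, field


@dataclass
class MixedMediaInputs:
    """Hypothetical container for what is wired into the node."""
    instruction: str                                   # arrives on the `text-in` handle
    images: list[str] = field(default_factory=list)    # hypothetical image references
    videos: list[str] = field(default_factory=list)    # hypothetical video references
    engine: str = "gemini-flash"                       # assumed default, not documented


def build_request(inputs: MixedMediaInputs) -> dict:
    """Bundle all inputs into one payload so they can be processed together."""
    instruction = inputs.instruction.strip()
    if not instruction:
        # Assumption: the node needs an instruction to know what to produce.
        raise ValueError("an instruction is required")
    if inputs.engine not in ("gemini-flash", "gemini-pro"):
        raise ValueError(f"unknown engine: {inputs.engine}")
    media = [{"type": "image", "ref": ref} for ref in inputs.images]
    media += [{"type": "video", "ref": ref} for ref in inputs.videos]
    return {"engine": inputs.engine, "instruction": instruction, "media": media}


request = build_request(
    MixedMediaInputs(
        instruction="Describe the visual style shared by these references.",
        images=["reference-1.png"],
        videos=["reference-clip.mp4"],
    )
)
```

Keeping images and videos in one ordered `media` list mirrors the idea that the model receives all inputs at once rather than one at a time.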
Configuration options
| Parameter | Description |
|---|---|
| Engine | Gemini Flash (faster) or Gemini Pro (more capable) |
| Instruction | What to do with the media inputs |
Credit cost
| Engine | Credits |
|---|---|
| Gemini Flash | 1 credit |
| Gemini Pro | 2 credits |
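For budgeting, the table can be turned into a small lookup. This is a minimal sketch that assumes the listed cost is charged once per node execution, regardless of how many media inputs are connected; the page does not state the billing unit, so treat that as an assumption. The function name `estimate_credits` is hypothetical.

```python
# Credits per engine, copied from the table above.
ENGINE_CREDITS = {"gemini-flash": 1, "gemini-pro": 2}


def estimate_credits(engine_id: str, runs: int) -> int:
    """Estimate total credits for `runs` executions with one engine.

    Assumes a flat per-execution charge (not confirmed by the documentation).
    """
    if engine_id not in ENGINE_CREDITS:
        raise ValueError(f"unknown engine: {engine_id}")
    if runs < 0:
        raise ValueError("runs must be non-negative")
    return ENGINE_CREDITS[engine_id] * runs


flash_total = estimate_credits("gemini-flash", 10)
pro_total = estimate_credits("gemini-pro", 10)
```

Under that assumption, the same batch of runs costs twice as many credits on Gemini Pro as on Gemini Flash.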
Tips
- This node is interactive: you provide the instruction while the flow is running
- Use it to generate video prompts based on reference images
- Great for analyzing existing content and producing descriptions
- Dynamic inputs allow flexible multimodal workflows
Example use cases
- Analyzing a reference video and generating a text description
- Comparing multiple images and selecting the best one via AI
- Generating video prompts that match the style of reference images
- Extracting information from screenshots or frames