Remotion Pipeline
Hyperclip uses Remotion, a React-based video rendering framework, to compose and render final videos.
How it works
Composition
Remotion compositions define the video structure as React components. Each layer of your video (base footage, captions, audio) is a separate component that renders at the correct time.
Composition layers:
- Base video: The main video content from your flow
- Caption overlay: Styled text captions synced to audio timestamps
- Voiceover track: Text-to-speech audio layer
- Background music: Music track mixed at lower volume
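To make "synced to audio timestamps" concrete, here is a minimal sketch of how a caption layer could decide what to show on a given frame. It is plain TypeScript with no Remotion imports. The `CaptionCue` shape, the `activeCaptionAt` helper, and the sample cues are assumptions made for illustration, not Hyperclip's actual data model.

```typescript
// Hypothetical caption cue shape; the field names are assumptions for illustration.
interface CaptionCue {
  text: string;
  startSec: number; // inclusive start time in seconds
  endSec: number; // exclusive end time in seconds
}

// Convert a frame index to seconds, then return the cue active at that moment (if any).
function activeCaptionAt(
  frame: number,
  fps: number,
  cues: CaptionCue[],
): CaptionCue | undefined {
  const timeSec = frame / fps;
  return cues.find((cue) => timeSec >= cue.startSec && timeSec < cue.endSec);
}

// Sample cues, e.g. as they might come from text-to-speech word timings.
const demoCues: CaptionCue[] = [
  { text: "Hello", startSec: 0, endSec: 1.5 },
  { text: "world", startSec: 1.5, endSec: 3 },
];

// Frame 60 at 30 fps is 2.0 s into the video, which falls inside the second cue.
const captionAtFrame60 = activeCaptionAt(60, 30, demoCues);
console.log(captionAtFrame60?.text); // prints "world"
```

In a real composition, a lookup like this would run inside the caption component on every frame, so the caption text is a pure function of the current frame number.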
Rendering
Rendering is handled by Remotion Lambda (serverless) for fast, parallel video encoding:
- The composition is sent to Remotion Lambda
- Lambda renders the video frame-by-frame
- Audio tracks are mixed and synced
- The final MP4 is produced and stored
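The "parallel" part of the steps above can be pictured as splitting the video's frame range into chunks that are rendered independently and then joined. The sketch below illustrates that idea in plain TypeScript. The `splitIntoChunks` helper and the chunk size of 200 frames are hypothetical and are not Hyperclip's configuration; Remotion Lambda handles this splitting internally (its `framesPerLambda` setting controls a comparable chunk size).

```typescript
// A contiguous range of frames, with both ends inclusive.
interface FrameChunk {
  startFrame: number;
  endFrame: number;
}

// Split frames 0..totalFrames-1 into consecutive chunks of at most framesPerChunk frames.
function splitIntoChunks(totalFrames: number, framesPerChunk: number): FrameChunk[] {
  if (!Number.isInteger(totalFrames) || totalFrames <= 0) {
    throw new Error("totalFrames must be a positive integer");
  }
  if (!Number.isInteger(framesPerChunk) || framesPerChunk <= 0) {
    throw new Error("framesPerChunk must be a positive integer");
  }
  const chunks: FrameChunk[] = [];
  for (let start = 0; start < totalFrames; start += framesPerChunk) {
    // The last chunk may be shorter than framesPerChunk.
    const end = Math.min(start + framesPerChunk, totalFrames) - 1;
    chunks.push({ startFrame: start, endFrame: end });
  }
  return chunks;
}

// A 30-second video at 30 fps has 900 frames.
const demoChunks = splitIntoChunks(900, 200);
console.log(demoChunks.length); // prints 5
console.log(demoChunks[demoChunks.length - 1]); // the final chunk covers frames 800 to 899
```

Smaller chunks mean more workers running at once, at the cost of more per-worker startup overhead and more pieces to stitch together.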
Performance
Rendering typically takes 30–120 seconds depending on video length and complexity. Long videos, multiple audio layers, and caption effects all push render time toward the upper end of that range.
Technical details
- Videos are rendered at 9:16 aspect ratio
- Frame rate matches the source video
- Audio mixing handles voiceover and background music volume balancing
- Caption animations are rendered frame-by-frame for smooth results
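As a rough illustration of volume balancing, the sketch below mixes a voiceover and a music track sample by sample, scaling the music down before summing. It is plain TypeScript. The `mixSamples` helper, the normalized sample range of -1 to 1, and the music gain of 0.2 are assumptions chosen for the example, not values taken from Hyperclip's pipeline.

```typescript
// Mix two mono tracks of normalized samples (-1..1).
// The music is scaled by musicGain so the voiceover stays dominant.
function mixSamples(voice: number[], music: number[], musicGain: number): number[] {
  const sampleCount = Math.max(voice.length, music.length);
  const mixed: number[] = [];
  for (let i = 0; i < sampleCount; i++) {
    // Treat a track that has already ended as silence.
    const voiceSample = i < voice.length ? voice[i] : 0;
    const musicSample = i < music.length ? music[i] * musicGain : 0;
    // Clamp the sum so the mix never leaves the valid sample range.
    mixed.push(Math.max(-1, Math.min(1, voiceSample + musicSample)));
  }
  return mixed;
}

// Voiceover ends one sample before the music does.
const demoMix = mixSamples([0.5, 0.9], [0.5, 1, 1], 0.2);
console.log(demoMix);
```

Hard clamping is the simplest way to keep the sum in range; a production mixer would more likely lower the gains or apply a limiter to avoid audible distortion.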