Remotion Pipeline

Hyperclip uses Remotion — a React-based video rendering framework — to compose and render final videos.

How it works

Composition

Remotion compositions define the video structure as React components. Each layer of your video (base footage, captions, audio) is a separate component that renders at the correct time.

Composition layers:
  1. Base video — The main video content from your flow
  2. Caption overlay — Styled text captions synced to audio timestamps
  3. Voiceover track — Text-to-speech audio layer
  4. Background music — Music track mixed at lower volume
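
To illustrate how layers can be placed on a shared timeline, here is a minimal sketch that models the four layers as plain data and converts a caption's audio timestamps into a frame range. The `Layer` and `CaptionCue` types, the function names, and the sample values are all hypothetical and are not Hyperclip's actual schema. In a real Remotion composition, a start frame and a duration like these would typically be passed to a `<Sequence>` component through its `from` and `durationInFrames` props.

```typescript
// Hypothetical layer model for illustration only; not Hyperclip's schema.
type LayerKind = "baseVideo" | "caption" | "voiceover" | "music";

interface Layer {
  kind: LayerKind;
  startFrame: number;       // first frame on which the layer is visible/audible
  durationInFrames: number; // number of frames the layer stays active
  text?: string;            // caption text (caption layers only)
}

interface CaptionCue {
  text: string;
  startSec: number; // timestamp in the audio where the caption appears
  endSec: number;   // timestamp where it disappears
}

// Convert a caption cue timed in seconds into a frame-based layer.
function cueToLayer(cue: CaptionCue, fps: number): Layer {
  const startFrame = Math.round(cue.startSec * fps);
  const endFrame = Math.round(cue.endSec * fps);
  return {
    kind: "caption",
    text: cue.text,
    startFrame,
    // Keep at least one frame so very short cues still render.
    durationInFrames: Math.max(1, endFrame - startFrame),
  };
}

// Return the layers that are active on a given frame, in stacking order.
function activeLayers(layers: Layer[], frame: number): Layer[] {
  return layers.filter(
    (l) => frame >= l.startFrame && frame < l.startFrame + l.durationInFrames
  );
}

// Example: a 10-second video at 30 fps (300 frames), assumed values.
const fps = 30;
const layers: Layer[] = [
  { kind: "baseVideo", startFrame: 0, durationInFrames: 300 },
  cueToLayer({ text: "Hello", startSec: 0.5, endSec: 2.0 }, fps),
  cueToLayer({ text: "World", startSec: 2.0, endSec: 3.5 }, fps),
  { kind: "voiceover", startFrame: 0, durationInFrames: 120 },
  { kind: "music", startFrame: 0, durationInFrames: 300 },
];

console.log(activeLayers(layers, 30).map((l) => l.kind));
// prints [ 'baseVideo', 'caption', 'voiceover', 'music' ]
```

Working in frames rather than seconds keeps every layer aligned to the same clock, which is what lets captions stay in sync with the audio.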

Rendering

Rendering is handled by Remotion Lambda (serverless) for fast, parallel video encoding:
  1. The composition is sent to Remotion Lambda
  2. Lambda renders the video frame-by-frame
  3. Audio tracks are mixed and synced
  4. The final MP4 is produced and stored
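
The parallelism in the steps above comes from dividing the frame range into chunks that separate Lambda invocations can render concurrently before the results are joined into one file. The sketch below shows only that partitioning idea. `splitIntoChunks` and the chunk size of 250 frames are illustrative assumptions, not Remotion's API or Hyperclip's configuration.

```typescript
// A contiguous range of frames assigned to one worker (both ends inclusive).
interface FrameChunk {
  index: number;
  startFrame: number;
  endFrame: number;
}

// Split [0, totalFrames) into consecutive chunks of at most framesPerChunk frames.
function splitIntoChunks(totalFrames: number, framesPerChunk: number): FrameChunk[] {
  if (!Number.isInteger(totalFrames) || totalFrames <= 0) {
    throw new RangeError("totalFrames must be a positive integer");
  }
  if (!Number.isInteger(framesPerChunk) || framesPerChunk <= 0) {
    throw new RangeError("framesPerChunk must be a positive integer");
  }
  const chunks: FrameChunk[] = [];
  for (let start = 0; start < totalFrames; start += framesPerChunk) {
    chunks.push({
      index: chunks.length,
      startFrame: start,
      // The last chunk may be shorter than the others.
      endFrame: Math.min(start + framesPerChunk, totalFrames) - 1,
    });
  }
  return chunks;
}

// Example: a 20-second video at 30 fps is 600 frames.
const chunks = splitIntoChunks(600, 250);
console.log(chunks.map((c) => `${c.startFrame}-${c.endFrame}`));
// prints [ '0-249', '250-499', '500-599' ]
```

Smaller chunks mean more workers and a shorter wall-clock render, at the cost of more invocations to coordinate and stitch together.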

Performance

Rendering typically takes 30–120 seconds, depending on video length and complexity. Render time grows with the length of the video and with the number of audio layers and caption effects.

Technical details

  • Videos are rendered at a 9:16 (vertical) aspect ratio
  • Frame rate matches the source video
  • Audio mixing handles voiceover and background music volume balancing
  • Caption animations are rendered frame-by-frame for smooth results
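
Two of the details above lend themselves to a short sketch: deriving output dimensions for a 9:16 frame, and balancing music against voiceover. The helper names, the "ducking" rule (lowering the music while the voiceover is speaking), and the volume values 0.15 and 0.5 are assumptions made for illustration. The source does not state which balancing rule or levels Hyperclip uses.

```typescript
// Height of a 9:16 frame for a given width, rounded to an even number
// because H.264 with the common yuv420p pixel format needs even dimensions.
function heightFor9x16(width: number): number {
  const h = Math.round((width * 16) / 9);
  return h % 2 === 0 ? h : h + 1;
}

// A span of frames during which the voiceover is speaking (endFrame exclusive).
interface VoiceSpan {
  startFrame: number;
  endFrame: number;
}

// Hypothetical balancing rule: play music quietly under speech, louder otherwise.
function musicVolumeAt(
  frame: number,
  voiceSpans: VoiceSpan[],
  duckedVolume = 0.15, // assumed level while the voiceover is speaking
  fullVolume = 0.5     // assumed level when there is no speech
): number {
  const speaking = voiceSpans.some(
    (s) => frame >= s.startFrame && frame < s.endFrame
  );
  return speaking ? duckedVolume : fullVolume;
}

console.log(heightFor9x16(1080)); // prints 1920
console.log(musicVolumeAt(10, [{ startFrame: 0, endFrame: 30 }])); // prints 0.15
console.log(musicVolumeAt(40, [{ startFrame: 0, endFrame: 30 }])); // prints 0.5
```

A per-frame function like `musicVolumeAt` fits Remotion's model, where an audio component's `volume` can be given as a function of the current frame.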