
Audio Models

Hyperclip uses two audio models: ElevenLabs for text-to-speech voiceover and Whisper for audio transcription.

ElevenLabs v3

  • ID: fal-elevenlabs-v3
  • Cost: $0.04/request (4 credits)
  • Used by: Voiceover
  • Voices: 21 pre-built voice presets

Available voices

Roger, Sarah, Laura, Charlie, George, Callum, River, Harry, Liam, Alice, Matilda, Will, Jessica, Eric, Bella, Chris, Brian, Daniel, Lily, Adam, Bill. Each voice has a distinct character. To find the best match, preview the voices in the Voiceover node during flow execution.

Whisper

  • ID: fal-whisper
  • Cost: $0.01/request (1 credit)
  • Used by: Auto Captions
  • Output: Timestamped word-level transcription

Whisper transcribes audio into individual words with precise start and end timestamps, enabling word-by-word caption animations.
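
As a rough illustration of how word-level timestamps can drive captions, the sketch below groups timed words into short caption chunks. The `Word` shape (text plus start and end times in seconds), the `group_words` helper, and its default thresholds are all hypothetical; they are not Hyperclip's or Whisper's actual output format.

```python
from dataclasses import dataclass


@dataclass
class Word:
    # Hypothetical shape for one transcribed word (times in seconds).
    text: str
    start: float
    end: float


def group_words(words, max_words=3, max_gap=0.6):
    """Group timed words into caption chunks.

    A new chunk starts when the current one already holds max_words words,
    or when the pause before the next word exceeds max_gap seconds.
    """
    chunks = []
    current = []
    for word in words:
        too_long = len(current) >= max_words
        long_pause = bool(current) and word.start - current[-1].end > max_gap
        if current and (too_long or long_pause):
            chunks.append(current)
            current = []
        current.append(word)
    if current:
        chunks.append(current)
    # Each caption spans from its first word's start to its last word's end.
    return [
        {
            "text": " ".join(w.text for w in chunk),
            "start": chunk[0].start,
            "end": chunk[-1].end,
        }
        for chunk in chunks
    ]
```

Each returned chunk carries its own start and end time, so a caption renderer could show a chunk for that interval and highlight individual words using their own timestamps.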

Cost summary

Model           Per-use cost   Credits
ElevenLabs v3   $0.04          4 credits
Whisper         $0.01          1 credit
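
The per-request figures above imply a rate of $0.01 per credit, so the audio cost of a flow can be estimated from its request counts. The sketch below is a minimal illustration: the `estimate_audio_cost` helper and the idea of passing request counts keyed by model ID are assumptions, not part of any Hyperclip API.

```python
# Credits per request, taken from the cost summary table.
CREDITS_PER_REQUEST = {
    "fal-elevenlabs-v3": 4,
    "fal-whisper": 1,
}

# Implied by the table: 4 credits = $0.04 and 1 credit = $0.01.
CENTS_PER_CREDIT = 1


def estimate_audio_cost(request_counts):
    """Return (total_credits, total_dollars) for the given request counts.

    request_counts maps a model ID to the number of requests made.
    This helper is hypothetical and only mirrors the table's arithmetic.
    """
    total_credits = 0
    for model_id, count in request_counts.items():
        if model_id not in CREDITS_PER_REQUEST:
            raise KeyError(f"Unknown audio model: {model_id}")
        total_credits += CREDITS_PER_REQUEST[model_id] * count
    # Work in whole cents first to avoid accumulating float error.
    total_dollars = (total_credits * CENTS_PER_CREDIT) / 100
    return total_credits, total_dollars
```

For example, a flow with three voiceover requests and two caption requests would use 3 × 4 + 2 × 1 = 14 credits.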