Audio Models

Hyperclip uses two audio models: ElevenLabs for text-to-speech voiceover and Whisper for audio transcription.

ElevenLabs v3

ID: fal-elevenlabs-v3
Cost: $0.04/request (4 credits)
Used by: Voiceover
Voices: 21 pre-built voice presets

Available voices

Roger, Sarah, Laura, Charlie, George, Callum, River, Harry, Liam, Alice, Matilda, Will, Jessica, Eric, Bella, Chris, Brian, Daniel, Lily, Adam, Bill.

Each voice has a distinct character. Preview them in the Voiceover node during flow execution to find the best match.
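When a voice is chosen programmatically, checking the name against the presets listed above can catch typos before a request is spent. The following is a minimal Python sketch, not part of Hyperclip: the `VOICES` tuple and the `validate_voice` helper are hypothetical names, and the case-insensitive matching is an assumption made for convenience.

```python
# Voice presets copied from the list above (21 names).
VOICES = (
    "Roger", "Sarah", "Laura", "Charlie", "George", "Callum", "River",
    "Harry", "Liam", "Alice", "Matilda", "Will", "Jessica", "Eric",
    "Bella", "Chris", "Brian", "Daniel", "Lily", "Adam", "Bill",
)


def validate_voice(name: str) -> str:
    """Return the canonical preset name for `name`.

    Matching ignores case and surrounding whitespace (an assumption,
    not documented Hyperclip behavior). Raises ValueError if no preset
    matches.
    """
    wanted = name.strip().lower()
    for voice in VOICES:
        if voice.lower() == wanted:
            return voice
    raise ValueError(f"Unknown voice preset: {name!r}")
```

For example, `validate_voice(" alice ")` returns `"Alice"`, while an unlisted name raises `ValueError`.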

Whisper

ID: fal-whisper
Cost: $0.01/request (1 credit)
Used by: Auto Captions
Output: Timestamped word-level transcription

Whisper transcribes audio into individual words with precise start and end timestamps, enabling word-by-word caption animations.
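To illustrate how word-level timestamps can drive caption animations, the sketch below groups timestamped words into short caption chunks. It is an illustration under stated assumptions, not Hyperclip's implementation: the `Word` type, the `group_into_captions` helper, and the `max_words` and `max_gap` parameters are all hypothetical, and the actual shape of the transcription output is not specified here.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One transcribed word (hypothetical shape; times are in seconds)."""
    text: str
    start: float
    end: float


def group_into_captions(words, max_words=3, max_gap=0.6):
    """Group timestamped words into caption chunks.

    A new chunk starts when the current chunk already holds `max_words`
    words, or when the silence between two consecutive words is longer
    than `max_gap` seconds.
    """
    chunks = []
    current = []
    for word in words:
        if current and (
            len(current) >= max_words
            or word.start - current[-1].end > max_gap
        ):
            chunks.append(current)
            current = []
        current.append(word)
    if current:
        chunks.append(current)

    # Each caption spans from its first word's start to its last word's end.
    return [
        {
            "text": " ".join(w.text for w in chunk),
            "start": chunk[0].start,
            "end": chunk[-1].end,
            "words": chunk,
        }
        for chunk in chunks
    ]
```

Keeping the per-word entries inside each caption lets a renderer highlight one word at a time while the whole chunk stays on screen.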

Cost summary

Model	Per-use cost	Credits
ElevenLabs v3	$0.04	4 credits
Whisper	$0.01	1 credit
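The table above can be turned into a quick cost estimate for a flow. The following Python sketch uses only the prices listed in this section; the `estimate_cost` helper is a hypothetical name, and the rate of 1 credit = $0.01 is inferred from the table rather than stated explicitly.

```python
from __future__ import annotations

# Credits per request, taken from the cost summary table.
CREDITS_PER_REQUEST = {
    "fal-elevenlabs-v3": 4,
    "fal-whisper": 1,
}

# Inferred from the table: 4 credits = $0.04, so 1 credit = 1 cent.
CENTS_PER_CREDIT = 1


def estimate_cost(requests: dict[str, int]) -> tuple[int, str]:
    """Return (total credits, formatted dollar amount).

    `requests` maps a model ID to the number of requests planned.
    Integer cents are used to avoid floating-point rounding.
    """
    credits = sum(
        CREDITS_PER_REQUEST[model_id] * count
        for model_id, count in requests.items()
    )
    dollars, cents = divmod(credits * CENTS_PER_CREDIT, 100)
    return credits, f"${dollars}.{cents:02d}"


# Five voiceovers, each followed by auto captions:
print(estimate_cost({"fal-elevenlabs-v3": 5, "fal-whisper": 5}))
# → (25, '$0.25')
```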
