Standard
The standard model is the default model and is suitable for most use cases.
It is based on Whisper large-v3 for transcription and pyannote.audio 3.3.0 for diarization, and works well for languages such as English, German, French, Spanish, Italian, and Dutch.
Costs are $0.0001 per second of audio, equivalent to $0.36 per hour of audio.
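As a rough illustration of the pricing arithmetic, the sketch below estimates the cost of a file from its duration. It assumes billing is strictly linear per second, with no minimum charge or rounding, which the text above does not confirm. The helper name `estimate_cost` is hypothetical and not part of any API.

```python
from decimal import Decimal

# Rate quoted above for the standard model, in USD per second of audio.
STANDARD_RATE_PER_SECOND = Decimal("0.0001")


def estimate_cost(duration_seconds):
    """Estimate the cost in USD for audio of the given length.

    Hypothetical helper: assumes linear per-second billing with no
    minimum charge and no rounding.
    """
    # Convert via str so float inputs such as 12.5 stay exact in Decimal.
    return STANDARD_RATE_PER_SECOND * Decimal(str(duration_seconds))


if __name__ == "__main__":
    print("1 hour:", estimate_cost(3600), "USD")
    print("90 seconds:", estimate_cost(90), "USD")
```

Decimal arithmetic is used here so that very small per-second amounts do not pick up binary floating-point error when summed over many files.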
Example
Request Example