The enhanced model builds on the standard model and adds LLM-based post-processing to improve transcription accuracy, produce more detailed diarization segments, and infer speaker labels or names. It can work well for languages on which the standard model is less accurate.

Limitations compared to the standard model:
- The enhanced model is slower.
- Unlike the standard model, the enhanced model does not output word-level timestamps or confidence scores.
- The maximum input audio duration is currently 60 minutes (this will be increased in the near future).
Pricing: $0.0005 per second of audio, equivalent to $1.80 per hour of audio.
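
The per-second rate and the current duration limit can be combined into a quick cost estimate. The following is a minimal sketch, not part of any Spectropic SDK: `estimate_cost_usd` is a hypothetical helper, and it assumes billing is linear in duration with no rounding or minimum charge, and that a clip of exactly 60 minutes is accepted.

```python
from decimal import Decimal

# Figures taken from the text above; both may change over time.
PRICE_PER_SECOND_USD = Decimal("0.0005")
MAX_DURATION_SECONDS = 60 * 60  # current 60-minute input limit


def estimate_cost_usd(duration_seconds: int) -> Decimal:
    """Estimate the enhanced-model cost for a clip of the given length.

    Hypothetical helper for illustration only. Assumes linear billing
    with no rounding or minimum charge.
    """
    if duration_seconds < 0:
        raise ValueError("duration must be non-negative")
    if duration_seconds > MAX_DURATION_SECONDS:
        # Assumption: the limit is inclusive of exactly 60 minutes.
        raise ValueError("audio exceeds the current 60-minute limit")
    return PRICE_PER_SECOND_USD * duration_seconds


print(estimate_cost_usd(3600))  # one full hour of audio
print(estimate_cost_usd(90))    # a 90-second clip
```

`Decimal` is used instead of `float` so that small per-second amounts add up without binary rounding error. For one hour of audio the estimate matches the $1.80 figure quoted above.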
Example
Request Example