- The enhanced model is slower than the standard model.
- Unlike the standard model, the enhanced model does not output word-level timestamps or confidence scores.
- The maximum input audio duration is currently 60 minutes (this will be increased in the near future).
$0.0005 per second of audio, equivalent to $1.80 per hour of audio.
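As a rough illustration of the pricing arithmetic and the duration limit stated above, the sketch below estimates the cost of a single audio file. The function name `estimate_cost_usd` and both constants are hypothetical helpers for this example only and are not part of the API. Billing is assumed to be strictly proportional to duration with no rounding or minimum charge, which the text above does not confirm.

```python
from decimal import Decimal

# Values taken from the text above; names are illustrative, not part of the API.
PRICE_PER_SECOND_USD = Decimal("0.0005")
MAX_DURATION_SECONDS = 60 * 60  # current 60-minute input limit


def estimate_cost_usd(duration_seconds):
    """Estimate the cost in USD for one audio file of the given length.

    Assumes cost is strictly proportional to duration (no rounding,
    no minimum charge), which is an assumption of this sketch.
    """
    if duration_seconds < 0:
        raise ValueError("duration must be non-negative")
    if duration_seconds > MAX_DURATION_SECONDS:
        raise ValueError("audio exceeds the current 60-minute input limit")
    # Convert via str so float inputs do not carry binary rounding noise.
    return PRICE_PER_SECOND_USD * Decimal(str(duration_seconds))


if __name__ == "__main__":
    cost = estimate_cost_usd(3600)  # one hour of audio
    print(f"${cost:.2f}")  # prints $1.80
```

Using `Decimal` rather than floats keeps the per-second rate exact, so one hour of audio comes out to exactly $1.80, matching the hourly figure above.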
Example
Request Example