Models

The Spectropic API currently supports two models for transcription and diraization. The standard model is the default model and is suitable for most use cases. The enhanced model is more accurate but slower.

Model	Languages	Price per second	Price per hour
`standard`	all	$0.0001 / second of audio	$0.36 / hour of audio
`enhanced`	all	$0.0005 / second of audio	$1.80 / hour of audio

Standard model

The standard model is based on Whisper large-v3-turbo for transcription and pyannote.audio 3.3.0 for diraization. It is suitable for most use cases and works well for languages like English, German, French, Spanish, Italian, and Dutch. Costs are $0.0001 per second of audio equivalent to $0.36 per hour of audio.

Example

Request Example

  curl  --request POST \
    --url https://api.spectropic.ai/v1/transcribe \
    --header: 'Authorization: Bearer <apikey>' \
    --header: 'Content-Type: application/json' \
    --data '{
      "url": "https://example.com/file.mp3",
      "model": "standard",
      "numSpeakers": 2,
      "language": "en",
      "vocabulary": "Spectropic, AI, LLama, Mistral, Whisper."
    }'

Enhanced model

The enhanced model is based on the standard model, but does LLM-based post processing to improve the accuracy of the transcription, get more detailed diarization segments, and infer labels or names of speakers. The enhanced model can work well for languages that are lower in the accuracy of the standard model.

Limitations compared to the standard model:

The enhanced model is slower.
Unlike the standard model, the enhanced model does not output word level timestamps and confidence scores.
Max input audio input duration is currently 60 minutes (this will be increased in the near future).

Costs are $0.0005 per second of audio equivalent to $1.80 per hour of audio.

Warning: The enhanced model is experimental and may fail or produce incorrect results (you won’t be charged on transcribe failure). Feel free to try it out and let us know what you think (mail to [email protected] or on X @thomas_mol).

Example

Request Example

  curl  --request POST \
    --url https://api.spectropic.ai/v1/transcribe \
    --header: 'Authorization: Bearer <apikey>' \
    --header: 'Content-Type: application/json' \
    --data '{
      "url": "https://example.com/file.mp3",
      "model": "enhanced",
      "numSpeakers": 2,
      "language": "en",
      "vocabulary": "Spectropic, AI, LLama, Mistral, Whisper."
    }'

API Reference

Schemas

Resources

Standard model

Example

Enhanced model

Limitations compared to the standard model:

Example

API Reference

Schemas

Resources

​Standard model

​Example

​Enhanced model

​Limitations compared to the standard model:

​Example

Standard model

Example

Enhanced model

Limitations compared to the standard model:

Example