Gladia
by Gladia
Whisper-based API with diarization, 99-language coverage, pay-per-minute.
TL;DR
Best for teams who like the Whisper model family but don't want to run GPUs. Pricing: from $0.0102/min.
Category
Transcription APIs
License
—
Stars
—
Last push
—
Pricing
from $0.0102/min
Platforms
API
What it is
Gladia wraps Whisper-class models in a developer-friendly API with speaker diarization, 99-language coverage, and per-minute pricing from $0.0102/min. It is a reasonable alternative to self-hosting faster-whisper when you want someone else to operate the GPUs. Last price check: 2026-04-20.
Best for: Teams who like the Whisper model family but don't want to run GPUs.
Watch out for: Smaller ecosystem than AssemblyAI/Deepgram; HIPAA on enterprise tiers only.
Install / use
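Gladia is consumed over HTTP rather than installed, so a minimal sketch of submitting a pre-recorded file might look like the following. The endpoint path, the `x-gladia-key` header name, and the `audio_url` / `diarization` field names are assumptions that should be checked against Gladia's current API reference; `build_transcription_request` and `submit` are hypothetical helper names, not part of any Gladia SDK.

```python
import json
import urllib.request

# Assumed values: verify against Gladia's current API reference before use.
GLADIA_PRERECORDED_URL = "https://api.gladia.io/v2/pre-recorded"  # assumed endpoint
API_KEY_HEADER = "x-gladia-key"  # assumed header name


def build_transcription_request(api_key: str, audio_url: str, diarization: bool = True):
    """Return (url, headers, body) for a transcription job. Sends nothing."""
    headers = {API_KEY_HEADER: api_key, "Content-Type": "application/json"}
    # Assumed field names for the JSON payload.
    payload = {"audio_url": audio_url, "diarization": diarization}
    return GLADIA_PRERECORDED_URL, headers, json.dumps(payload).encode("utf-8")


def submit(api_key: str, audio_url: str) -> dict:
    """Send the request and return the decoded JSON response (network call)."""
    url, headers, body = build_transcription_request(api_key, audio_url)
    request = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping the request construction separate from the network call makes the payload easy to inspect or unit-test without an API key.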
Features
| Feature | Gladia |
|---|---|
| Speaker diarization | Yes |
| Word-level timestamps | Yes |
| Streaming / real-time | Yes |
| Languages supported | 99 |
| HIPAA eligible | Enterprise tiers only |
Gladia vs Whipscribe
| Feature | Gladia | Whipscribe |
|---|---|---|
| Category | Transcription APIs | Transcription APIs |
| Pricing | from $0.0102/min | free beta |
| Speaker diarization | Yes | Yes |
| Word timestamps | Yes | Yes |
| Streaming | Yes | No |
| Languages | 99 | 99 |
| Platforms | API | Web, API, MCP |
Sources & dates for the comparison above
- diarization: “Gladia's diarization feature labels each utterance with a speaker identifier.” — source (checked 2026-04-23)
- word timestamps: “Per-word timestamps are included with start and end seconds.” — source (checked 2026-04-23)
- streaming: “Gladia provides a WebSocket streaming endpoint for live audio.” — source (checked 2026-04-23)
- pricing: “Pay-as-you-go pricing from $0.612 per hour (~$0.0102/min).” — source (checked 2026-04-23)
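The per-hour and per-minute figures in the pricing quote can be cross-checked with a little arithmetic. The sketch below assumes the quoted "from" rate of $0.612 per hour applies flat to every minute, which real plans may not do; `per_minute_rate` and `estimate_cost` are illustrative names only.

```python
from decimal import Decimal

# "From" rate quoted in the pricing source above; actual plans may differ.
HOURLY_RATE_USD = Decimal("0.612")


def per_minute_rate(hourly_rate: Decimal = HOURLY_RATE_USD) -> Decimal:
    """Convert a per-hour rate to a per-minute rate."""
    return hourly_rate / Decimal(60)


def estimate_cost(audio_minutes, hourly_rate: Decimal = HOURLY_RATE_USD) -> Decimal:
    """Estimate cost in USD for a given audio duration, rounded to 4 decimal places."""
    cost = per_minute_rate(hourly_rate) * Decimal(str(audio_minutes))
    return cost.quantize(Decimal("0.0001"))


if __name__ == "__main__":
    print("per minute:", per_minute_rate())
    print("90-minute recording:", estimate_cost(90))
```

`Decimal` is used instead of floats so that small per-minute rates do not pick up binary rounding noise when multiplied across many minutes.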
Alternatives to Gladia
Whipscribe is a managed faster-whisper + whisperX service. If you want transcripts without running infrastructure, you can paste a URL or upload a file and get a transcript back.