Vosk

by Alpha Cephei

Lightweight offline speech recognition for 20+ languages, runs on a Raspberry Pi.

TL;DR

Best for real-time streaming transcription on-device or on edge hardware with limited resources. Pricing: free.

Category: Open source
License: Apache-2.0
Stars: ★ 14.6k
Last push: 2026-02-22
Pricing: free
Platforms: Linux, macOS, Windows, iOS, Android, Edge

What it is

Vosk is a practical, production-friendly offline recognizer built on Kaldi. It predates the Whisper era but remains the go-to choice for true streaming on constrained hardware: IoT, kiosks, and embedded devices. It is Apache-2.0 licensed.

Best for: Real-time streaming transcription on-device or on edge hardware with limited resources.
Watch out for: Word error rate (WER) trails Whisper on most languages; the Kaldi-based architecture is older; the community is smaller now.

Install / use

pip install vosk
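
After installing, a typical streaming loop feeds audio to the recognizer in small chunks. The sketch below is a minimal example, not an official recipe: it assumes a 16-bit mono PCM WAV file and a Vosk model directory that has already been downloaded and unpacked. The function name `transcribe_wav` and the `chunk_frames` default are illustrative choices, not part of Vosk.

```python
import json
import wave


def transcribe_wav(wav_path, model_path, chunk_frames=4000):
    """Stream a 16-bit mono PCM WAV file through Vosk and return the text.

    wav_path and model_path are caller-supplied; model_path is assumed to
    point at an unpacked Vosk model directory.
    """
    with wave.open(wav_path, "rb") as wf:
        # Reject stereo or non-16-bit input before loading the model.
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM WAV")

        # Imported here so the format check works even without vosk installed.
        from vosk import Model, KaldiRecognizer

        model = Model(model_path)
        rec = KaldiRecognizer(model, wf.getframerate())

        pieces = []
        while True:
            data = wf.readframes(chunk_frames)
            if not data:
                break
            # AcceptWaveform returns True when an utterance boundary is reached.
            if rec.AcceptWaveform(data):
                pieces.append(json.loads(rec.Result())["text"])
        # Flush whatever audio is still buffered in the recognizer.
        pieces.append(json.loads(rec.FinalResult())["text"])

    return " ".join(p for p in pieces if p)
```

Usage would look like `transcribe_wav("speech.wav", "path/to/model")`, where both paths are placeholders. For live captions, `rec.PartialResult()` can be polled inside the loop whenever `AcceptWaveform` returns False.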

Features

Speaker diarization: Yes
Word-level timestamps: Yes
Streaming / real-time: Yes
Languages supported: 20
HIPAA eligible: No
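
Word-level timestamps are switched on by calling `SetWords(True)` on a `KaldiRecognizer`; each result then carries a per-word list alongside the full text. The helper below is a rough sketch of pulling timings out of such a result. The name `word_timings` is hypothetical, and the sample JSON is hand-written in the shape Vosk is understood to emit, not real recognizer output.

```python
import json


def word_timings(result_json):
    """Return (word, start, end) tuples from a Vosk result JSON string."""
    result = json.loads(result_json)
    # The "result" key is treated as optional: it is absent when no words
    # were recognized or when SetWords(True) was not called.
    return [(w["word"], w["start"], w["end"]) for w in result.get("result", [])]


# Hand-written sample for illustration only; times are in seconds.
sample = (
    '{"result": ['
    '{"conf": 1.0, "start": 0.0, "end": 0.42, "word": "hello"}, '
    '{"conf": 0.93, "start": 0.42, "end": 0.9, "word": "world"}'
    '], "text": "hello world"}'
)

for word, start, end in word_timings(sample):
    print(f"{start:.2f}-{end:.2f} {word}")
```

In a real pipeline the input would be the string returned by `rec.Result()` or `rec.FinalResult()`.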

Links

GitHub repo ↗

Vosk vs Whipscribe

Feature | Vosk | Whipscribe
Category | Open source | Transcription APIs
Pricing | free | free beta
Speaker diarization | Yes | Yes
Word timestamps | Yes | Yes
Streaming | Yes | No
Languages | 20 | 99
Platforms | Linux, macOS, Windows, iOS, Android, Edge | Web, API, MCP

Alternatives to Vosk

Whipscribe is a managed faster-whisper + whisperX service. If you want transcripts without running infrastructure, paste a URL or upload a file and you'll have a transcript in seconds.