Mistral debuts Voxtral Transcribe 2, a family of speech-to-text models with speaker diarization and ultra-low latency, under the Apache 2.0 open-weight license
The Deep View · Sabrina Ortiz
Related Coverage
- Voxtral transcribes at the speed of sound. Mistral AI
- These New AI Transcription Models Are Built for Speed and Privacy CNET · Jon Reed
- Mistral drops Voxtral Transcribe 2, an open-source speech model that runs on-device for pennies VentureBeat · Michael Nuñez
- Voxtral transcribes at the speed of sound (via) Mistral just released Voxtral Transcribe 2 … Simon Willison's Weblog · Simon Willison
- Mistral's New Ultra-Fast Translation Model Gives Big AI Labs a Run for Their Money Wired · Joel Khalili
- Voxtral Transcribe 2 Hacker News
- Mistral AI Launches Voxtral Transcribe 2: Pairing Batch Diarization And Open Realtime ASR For Multilingual Production Workloads At Scale MarkTechPost · Michal Sutter
- Mistral AI's Voxtral Transcribe 2 Launch Breaks Sound Barrier eWeek
- Voxtral Transcribe 2 offers speech recognition at $0.003 per minute The Decoder · Jonathan Kemper
Discussion
- @mistralai on X: Introducing Voxtral Transcribe 2, next-gen speech-to-text models by @MistralAI. State-of-the-art transcription, speaker diarization, sub-200ms real-time latency. Details in 🧵 [video]
- @mistralai on X: Voxtral Realtime is built for voice agents and live applications. Its natively streaming architecture delivers latency configurable to sub-200ms. And at 480ms, it stays within 1-2% WER of our offline model. We release the model as open weights under Apache 2.0. [image]
- Simon Willison (@simonw) on X: The demo on https://huggingface.co/... is worth a try - ignore the “No microphone found” message, clicking “Record” and allowing your browser to use a microphone fixes that. It transcribes very accurately in almost real-time. It's really impressive.
- r/BuyFromEU on Reddit: Voxtral transcribes at the speed of sound.