TEXXR

Chronicles

The story behind the story


OpenAI adds gpt-4o-mini-tts, a text-to-speech model that it says delivers more nuanced and realistic-sounding speech, and two speech-to-text models to its API

  • Marc Manara on linkedin
    2025 is the year of voice... and agents.. and well, voice agents.  —  OpenAI launched 3 new audio models today - 2 new speech-to-text models and a new text-to-speech model. …
  • Olivier Godement on linkedin
    Voice AI agents are getting real and fun!  We're launching new audio models and tools to make it easy to build capable voice agents. …

TechCrunch Kyle Wiggers

Discussion

  • @implicator @implicator on bluesky
    Claude gets web-smart.  Real-time search + AI synthesis = game changer.  Financial analysts, sales teams, researchers: Your AI assistant just learned to time travel.  The knowledge cutoff era ends now 🔍  —  www.implicator.ai/claude-gets- ...  tip @techmeme.com
  • @fry69.dev @fry69.dev on bluesky
    Unrelated to OpenAI, here is an interesting text to speech model/generator with supports “emotion” sounds like <laugh>, <chuckle>, <sigh>, etc  —  Demo space on HuggingFace -> huggingface.co/spaces/prith...
  • @implicator @implicator on bluesky
    OpenAI drops next-gen voice models that actually work.  Scary-good transcription + AI voices with personality.  Plus video tech on deck.  Silicon Valley's game of catch-up begins 🎯  —  www.implicator.ai/openai-upgra...  tip @techmeme.com
  • @openaidevs @openaidevs on x
    Three new state-of-the-art audio models in the API: 🗣️ Two speech-to-text models—outperforming Whisper 💬 A new TTS model—you can instruct it *how* to speak 🤖 And the Agents SDK now supports audio, making it easy to build voice agents. Try TTS now at https://openai.fm/.
  • @jijosunny Jijo Sunny on x
    Today @OpenAI dropped 3 impressive voice models, and we were the first to test them internally (thanks!).  Bottom line: It's the best STT model by far—and we've tested them all.  I was pleasant surprised by how well it handled context and nuance in smaller languages like Malayalam.
  • @samratmansingh Samrat Man Singh on x
    OpenAI's new TTS looks(and sounds) pretty great for the price. Also, I hope this pushes other providers to just price API usage by minute. Every other TTS provider(ElevenLabs, Cartesia, etc) currently have monthly credits pricing. [image]
  • @juberti Justin Uberti on x
    Lots of new audio stuff today: - ASR: gpt-4o-transcribe with SoTA performance - TTS: gpt-4o-mini-tts with playground at https://openai.fm/ - Realtime API: new noise reduction and semantic VAD - Agents SDK: add voice to an agent with 10 LOC Details: https://platform.openai.com/ ...