kwindla · TEXXR

2025-08-29

Voice-only programming with the new OpenAI Realtime API ... I spend a lot of time these days pair programming with LLMs. Often I'm talking rather than typing. This “voice dictation” use case has become an important vibe benchmark for me. Being able to create text input just by [video]

ZDNET

OpenAI makes its Realtime API generally available with features like MCP support and debuts gpt-realtime, its most advanced speech-to-speech model, in the API

[video] @liodakis: Congrats to @pbbakkum on shipping gpt-realtime! It's been awesome watching him and the multimodal team sweat the details and get to a GA quality multimodal mode...

2025-02-05

Google's full release of Gemini 2.0 Flash is a great thing for the voice AI ecosystem. Up to this point, almost every production voice AI agent has used GPT-4o. Voice AI apps need an LLM with fast TTFT, good instruction following, reliable function calling, and natural

The Verge

Google releases Gemini 2.0 Flash via its API, an experimental Gemini 2.0 Pro version via its apps, Gemini 2.0 Flash Thinking, and 2.0 Flash-Lite in AI Studio

Gemini 2.0 AI updates include cheaper access for developers, and AI that can use other Google apps like YouTube.
