Gemini co-lead Oriol Vinyals says Gemini 3's gains come from better pre-training and post-training, contradicting the idea that pre-training gains are falling
which we discussed in our NeurIPS '25 talk with @ilyasut and @quocleix, the team delivered a drastic jump. The delta between 2.5 and 3.0 is ... Andrej Karpathy / @karpathy : I played with Gemini 3 ...
First impressions of ChatGPT Atlas, as browser agents remain confusing, with insurmountable security and privacy risks including prompt injection attacks
a web browser with ChatGPT built in, not bolted on. The browser is the agent now. Tabs are prompts. The search bar is dead. Welcome to the post-URL era. P.S the browser wrote this on its own Arlan / @...
Samsung introduces the Tiny Recursion Model, a 7M-parameter model that can outperform LLMs 10,000x larger, like Gemini 2.5 Pro and o3-mini, on specific problems
The trend of AI researchers developing new, small open source generative models that outperform far larger …
OpenAI and Apollo Research trained o3 and o4-mini versions to not engage in “scheming”, or secretly pursuing some other agenda, reducing “covert actions” ~30X
ZDNET's key takeaways — Several frontier AI models show signs of scheming.
GPT-5 Thinking in ChatGPT is shockingly good at search and demonstrates the potential of combining tool calling with chain-of-thought reasoning
“Don't use chatbots as search engines” was great advice for several years... until it wasn't. I wrote about how good OpenAI's o3 was at using …
DeepSeek details V3.1 and says it surpasses R1 on key benchmarks and is customized to work with next-gen Chinese-made AI chips, after unveiling it on August 19
Introducing DeepSeek-V3.1: our first step toward the agent era! 🚀 Tobias Mann / The Register : DeepSeek's new V3.1 release points to potent new Chinese chips coming soon Hugging Face : DeepSeek-V3.1 ...
GPT-5 review: GPT-5-Thinking is a substantial upgrade over o3-pro, Auto is only useful for free tier users, picking the right model still matters, and more
What do I ultimately make of all the new versions of GPT-5? The practical offerings and how they interact continue to change by the day.
GPT-5's release was underwhelming, offering incremental improvements and failing to meet expectations, showing that pure scaling simply isn't the path to AGI
and he's not alone Maximilian Schreiner / The Decoder : GPT-5 is here and Gary Marcus is not impressed Laura Varley / Silicon Republic : Altman admits GPT-5 currently ‘way dumber’ amid rough roll-out ...
OpenAI says ChatGPT Pro users can select old models for now but plans to deprecate them in 60 days; Sam Altman says Plus users will be able to keep using GPT-4o
The “Best Friend” of Many ChatGPT Users Now Comes at a Price Jackson Chen / Engadget : OpenAI brings GPT-4o back online after users melt down over the new model Amanda Caswell / Tom's Guide : ChatGPT-...
GPT-5 hands-on: it exudes competence but doesn't feel like a dramatic leap ahead of other LLMs, and the pricing is aggressively competitive with other providers
And It Changes Everything Tyler Cowen / Marginal Revolution : GPT-5, a short and enthusiastic review GPT-5 : GPT-5 — Our hands-on review of OpenAI's newest model based on weeks of testing — The Ve...