Google rolls out Gemini 3.1 Pro, which it says is “a step forward in core reasoning”, for all users in the Gemini app; the .1 increment is a first for Google
Google rolls out Gemini 3.1 Pro, which it says is “a step forward in core reasoning”, for all users in the Gemini app; the .1 increment is a first for Google
In November, Google introduced Gemini 3 Pro in preview, with Gemini 3 Flash following a month later.
Google rolls out Gemini 3.1 Pro, which it says is “a step forward in core reasoning”, for all users in the Gemini app; the .1 increment is a first for Google
In November, Google introduced Gemini 3 Pro in preview, with Gemini 3 Flash following a month later.
Anthropic launches Claude Sonnet 4.6 with improvements in coding, computer use, instruction following, and more; it features a 1M token context window in beta
Claude Sonnet 4.6 is our most capable Sonnet model yet. It's a full upgrade of the model's skills across coding, computer use …
Anthropic launches Claude Sonnet 4.6 with improvements in coding, consistency, and more, for Free and Pro users; it features a 1M token context window in beta
Claude Sonnet 4.6 is our most capable Sonnet model yet. It's a full upgrade of the model's skills across coding, computer use …
Google updates Gemini 3 Deep Think to better solve modern science, research, and engineering challenges and expands it via the Gemini API to some researchers
Our most specialized reasoning mode is now updated to solve modern science, research and engineering challenges.
Google updates Gemini 3 Deep Think to better solve modern science, research, and engineering challenges and expands it via the Gemini API to some researchers
Our most specialized reasoning mode is now updated to solve modern science, research and engineering challenges.
Google unveils Gemini 3, its “most intelligent” and “factually accurate” model yet, with improvements across coding and reasoning, and offering less “flattery”
The flagship Gemini 3 Pro model is coming to the Gemini app and Search, with improvements across coding, reasoning, and less ‘flattery.’
OpenAI touts GPT-5's scores on math, coding, and health benchmarks: 94.6% on AIME 2025 without tools, 74.9% on SWE-bench Verified, and 46.2% on HealthBench Hard
After literally years of hype and speculation, OpenAI has officially launched a new lineup of large language models (LLMs) …
Artificial Analysis benchmarks: Grok 4 is now the leading AI model, a first for xAI; Grok 4's per-token pricing is more expensive than Gemini 2.5 Pro's and o3's
xAI gave us early access to Grok 4 - and the results are in. Grok 4 is now the leading AI model. We have run our full suite of benchmarks and Grok 4 achieves an Artificial Analysis...
xAI introduces Grok 4, trained on its Colossus supercomputer, with multimodal features, faster reasoning, Grok 4 Voice, Grok 4 Code, a new interface, and more
Deeper thinking and greater reasoning is promised — An hour after the live stream was supposed to start last night (July 9) …
OpenAI releases GPT‑4.1, GPT‑4.1 mini, and GPT‑4.1 nano, which excel at coding, instruction following, and long context understanding, available via its API
OpenAI on Monday launched a new family of models called GPT-4.1. Yes, “4.1” — as if the company's nomenclature wasn't confusing enough already.
The Arc Prize Foundation says its new ARC-AGI-2 test stumps most AI models; humans get 60% of the questions right but GPT-4.5 and Claude 3.7 Sonnet score ~1%
[image] François Chollet / @fchollet : Unlike ARC-AGI-1, this new version is not easily brute-forced. Current top AI approaches score 0-4%. All base LLMs (GPT-4.5, Claude 3.7 Son...
OpenAI unveils o3 and o3-mini, trained to “think” before responding via what OpenAI calls a “private chain of thought”, and plans to launch them in early 2025
12 Days of OpenAI: Day 12 Naomi Li Gan / Tech in Asia : OpenAI unveils AI model for advanced reasoning Bojan Stojkovski / Interesting Engineering : OpenAI unveils o3 reasoning AI m...
OpenAI unveils o3 and o3-mini, trained to “think” before responding via what OpenAI calls a “private chain of thought”, and plans to launch them in early 2025
OpenAI announced its new o3 models on Friday. — In a tweet ahead of its final livestream for its …