Researchers: OpenAI's o1 analyzes languages as well as a human expert, including inferring the phonological rules of made-up languages without prior knowledge
If language is what makes us human, what does it mean now that large language models have gained “metalinguistic” abilities?
AI reasoning models cost more to benchmark, making it harder to independently verify claims; Artificial Analysis says evaluating OpenAI's o1 costs $2,767.05
AI labs like OpenAI claim that their so-called “reasoning” AI models, which can “think” through problems step by step …
Alibaba releases open-source reasoning model QwQ-32B on Hugging Face and ModelScope, claiming comparable performance to DeepSeek-R1 but with lower compute needs
Introduction: QwQ is the reasoning model of the Qwen series. Paul Barker / InfoWorld : Alibaba says its new AI model rivals DeepSeek's R-1, OpenAI's o1 Jose Antonio Lanz / Decrypt : Alibaba's Latest A...
Grok-3 hands-on: its thinking capability feels state of the art and rivals OpenAI's o1 pro models, DeepSearch offers a blend of search and reasoning, and more
I was given early access to Grok 3 earlier today, making me, I think, one of the first few who could run a quick vibe check.
Microsoft makes OpenAI's o1 model, branded as “Think Deeper”, free for all Copilot users, after launching o1 in October 2024 as a paid Copilot Pro feature
Think Deeper first launched in October as a paid Copilot Pro feature, but it's now free.
A bear case for Nvidia: hardware competitors, LLM code translation to avoid CUDA lock-in, DeepSeek's training and inference efficiency breakthroughs, and more
DeepSeek recently released models claiming up to 45x more efficient training and inference compared to today's best-known large language models (like OpenAI's o1). … Forums: Hacker News : The impact o...
Industry insiders say DeepSeek's focus on research makes it a dangerous competitor as it's willing to share breakthroughs rather than protect them for profits
China is pulling the same trick. — www.ft.com/content/747a... Mastodon: Brian Kung / @briankung@hachyderm.io : “There's a pretty delicious, or maybe disconcerting irony to this, given OpenAI's found...
ByteDance debuts Doubao-1.5-pro, claiming its latest flagship AI model outperforms OpenAI's o1 in AIME benchmarks, joining DeepSeek in China's AI reasoning push
TikTok owner ByteDance on Wednesday released an update to its flagship AI model aimed at challenging Microsoft-backed OpenAI's …
Google releases Gemini 2.0 Flash Thinking, an experimental “reasoning” model that “explicitly shows its thoughts” and can use them to strengthen its reasoning
Quick: what sort of prompts should you run against GPT-4o vs Gemini 1.5 Flash vs o1 vs o1-pro vs gemini-2.0-flash-thinking-exp? X: Jeff Dean / @jeffdean : Introducing Gemini 2.0 Flash Thinking, an exp...
An evaluation of six frontier AI models for in-context scheming when strongly nudged to pursue a goal: only OpenAI's o1 was capable of scheming in all the tests
It presents a new safety challenge that OpenAI is trying to address. — techcrunch.com/2024/12/05/o... Anders Sandberg / @arenamontanus : In an IVA discussion on AI yesterday evening professor Kristi...