Nvidia launches Nemotron 3 Ultra, a 550B-parameter MoE open model; Artificial Analysis says it is the smartest open US model, but trails Chinese model Kimi K2.6
It has roughly 550 billion total parameters, with about 55 billion active at any given time.
Gemini co-lead Oriol Vinyals says Gemini 3's gains come from better pre-training and post-training, contradicting the idea that pre-training gains are falling
which we discussed in our NeurIPS '25 talk with @ilyasut and @quocleix—the team delivered a drastic jump. The delta between 2.5 and 3.0 is [image] Andrej Karpathy / @karpathy : I played with Gemini 3 ...
Baidu unveils two AI chips: the M100 for efficient MoE inference, coming in early 2026, and the M300 for training super-large multimodal models, coming in 2027
The M100 and M300 provide ‘powerful, low-cost and controllable AI computing power’ to support the nation's self-reliance push, the firm says
OpenAI releases gpt-oss-120b and gpt-oss-20b, its first open-weight models since GPT-2; the smaller gpt-oss-20b can run locally on a device with 16GB+ of RAM
gpt-oss-120b and gpt-oss-20b push the frontier of open-weight reasoning models Simon Willison / Simon Willison's Weblog : OpenAI's new open weight (Apache 2) models are really good OpenAI on GitHub : ...
Z.ai, formerly known as Zhipu and that has raised $1.5B from Tencent and others, releases GLM-4.5, an open-source AI model that it says is cheaper than DeepSeek
chinese models really are taking over huh Simon Willison / @simonwillison.net : Pretty decent pelicans from the new GLM-4.5 and GLM-4.5 Air models. Both models are MIT licensed, released by Chinese A...
Alibaba debuts the Qwen3-Coder model for agentic coding, including a 480B-parameter MoE variant, and open sources Qwen Code, a CLI tool adapted from Gemini CLI
Qwen 39.4k — Text Generation Transformers Safetensors qwen3_moe conversational Coco Feng / South China Morning Post : Alibaba upgrades flagship Qwen3 model to outperform OpenAI, DeepSeek in maths, c...
Moonshot's Kimi K2 uses a 1T-parameter MoE architecture with 32B active parameters and outperforms models like GPT-4.1 and DeepSeek-V3 on key benchmarks
Moonshot AI, the Chinese artificial intelligence startup behind the popular Kimi chatbot, released an open-source language model on Friday …
Mark Zuckerberg says Meta will share news about a Llama 4 Reasoning model “in the next month”
David Sacks Jesus Rodriguez / TheSequence : The Sequence Radar #526: Llama 4 Scout and Maverick are Here! Aman Gupta / Livemint : ChatGPT vs Meta AI: Which AI chatbot is better after the Llama 4 launc...
Mac Studio with M3 Ultra and 512GB of unified memory review: opens up new workflows on a ~$10,000 desktop, like running a quantized version of DeepSeek R1 671B
if confusing — performance magic Federico Viticci / MacStories : The M3 Ultra Mac Studio for Local LLMs Brandon Hill / Tom's Hardware : Apple Mac Studio review: M3 Ultra offers amazing performance, si...
Google launches Gemini 1.5 for developers and enterprise users, with a context window of up to 1M tokens, and says Gemini 1.5 Pro is on par with Gemini Ultra
but you won't be able to enjoy it just yet Ryan Morrison / Tom's Guide : I test AI for a living and Google's Gemini Pro 1.5 is a true turning point Shaurya Tomer / Hindustan Times : Google DeepMind an...