Nvidia debuts Nemotron 3 Super, a 120B-parameter hybrid MoE open-weight model; filing: Nvidia plans to spend $26B over the next five years to build open models
Wired · Will Knight
Related Coverage
- NVIDIA Nemotron 3 Super NVIDIA Nemotron
- Nvidia launches Nemotron 3 Super, a 120B open model for large-scale AI systems The New Stack · Frederic Lardinois
- New NVIDIA Nemotron 3 Super Delivers 5x Higher Throughput for Agentic AI NVIDIA · Kari Briski
- NVIDIA releases Nemotron 3 Super, 120B open model with 1M-token context VideoCardz.com
- NVIDIA Nemotron 3 Super now available on Workers AI Cloudflare
- Nvidia boosts open models with Nemotron 3 Super The Deep View
Discussion
-
@kuchaev
Oleksii Kuchaiev
on x
Nemotron 3 Super is here — 120B total / 12B active, Hybrid SSM Latent MoE, designed for Blackwell. Truly open: permissive license, open data, open training infra. See analysis on @ArtificialAnlys. Details in thread 🧵 below: [image]
-
@igtmn
Igor Gitman
on x
Nemotron 3 Super is out! It's really good and it will only get better from here. And we release all the details - tech report, training code, training data, model weights. Everything you need to build a model like this yourself!
-
@ggerganov
Georgi Gerganov
on x
In collaboration with NVIDIA we announce support for the new NVIDIA Nemotron 3 Super model in llama.cpp. NVIDIA Nemotron 3 Super is a 120B open MoE model activating just 12B parameters to deliver maximum compute efficiency and accuracy for complex multi-agent applications.
-
@mweinbach
Max Weinbach
on x
Trying out the new Nvidia Nemotron 3 Super model on my Mac Studio! [image]
-
@samhogan
Sam Hogan
on x
We've been testing Nemotron 3 Super for the last few weeks. TL;DR: it's easily the best Open Source American model for its size. Super fast. Great for agents and tool-calling use cases. We'll be shipping a series of post-trained Nemotron models in the coming weeks.
-
@manuelfaysse
Manuel Faysse
on x
If you ever wondered how LLMs became so good at MMLU, the Nvidia Nemotron 3 Super report states that 11.1% of Pretraining Phase 1 data (20T tokens) is MMLU-style SFT data: over 2T synthetic tokens specifically designed to reach the coveted 86% score. [image]
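The "over 2T tokens" figure in the post above is simple arithmetic on the two numbers it cites (11.1% share, 20T-token Phase 1 corpus); a one-line sanity check:

```python
# Back-of-the-envelope check of the figure quoted above.
phase1_tokens = 20e12       # 20T tokens, as stated in the post
mmlu_style_share = 0.111    # 11.1%, as stated in the post

synthetic_tokens = phase1_tokens * mmlu_style_share
print(f"{synthetic_tokens / 1e12:.2f}T tokens")  # → 2.22T tokens
```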
-
@nvidiaaidev
@nvidiaaidev
on x
This latest addition to the Nemotron family isn't just a bigger Nano. ✅ Up to 5x higher throughput and 2x accuracy than the previous version ✅ Latent MoE that calls 4x as many expert specialists for the same inference cost ✅ Multi-token prediction that dramatically reduces [imag…
-
@nvidianewsroom
@nvidianewsroom
on x
NVIDIA Nemotron 3 Super is here to accelerate the era of agentic AI. Optimized for NVIDIA Blackwell, this 120B open model uses a hybrid Mixture-of-Experts (MoE) architecture that delivers 5x higher throughput and 2x higher accuracy. The model combines advanced reasoning with a
-
@_albertgu
Albert Gu
on x
as always, exciting to see NVIDIA continue to invest in Mamba hybrids and true open source. very impressive results!
-
@jiantaoj
Jiantao Jiao
on x
Nemotron 3 Super arrived! With efficiency in mind (Hybrid SSM Latent MoE, designed for Blackwell), the accuracy is also incredible. The most important aspect is scaling RL, utilizing the highly efficient and scalable Nemo Gym backend for RL environments and Nemo RL for model
-
@ctnzr
Bryan Catanzaro
on x
Announcing NVIDIA Nemotron 3 Super! 💚120B-12A Hybrid SSM Latent MoE, designed for Blackwell 💚36 on AAIndex v4 💚up to 2.2X faster than GPT-OSS-120B in FP4 💚Open data, open recipe, open weights Models, Tech report, etc. here: https://research.nvidia.com/ ... And yes, Ultra is comin…
-
@artificialanlys
@artificialanlys
on x
NVIDIA has released Nemotron 3 Super, a 120B (12B active) open weights reasoning model that scores 36 on the Artificial Analysis Intelligence Index with a hybrid Mamba-Transformer MoE architecture. We were given access to this model ahead of launch and evaluated it across [image]
-
@cloudflaredev
@cloudflaredev
on x
Building multi-agent systems? @NVIDIA's Nemotron 3 Super (120B A12B) is now on Workers AI. - Reasoning and tool calling for complex multi-agent workflows - Built for code, finance, cybersecurity, and search agent use cases Learn more: https://developers.cloudflare.com/ ...
-
@nvidia
@nvidia
on x
New NVIDIA Nemotron 3 Super Delivers 5x Higher Throughput for Agentic AI
-
@natolambert
Nathan Lambert
on x
This looks like a model that's competitive with GPT OSS 120B or similar Qwen3.5 models on intelligence & speed, while coming with tons of open data + training details. It's a huge contribution for the ecosystem. Congrats Nvidia on the Nemotron 3 Super release!
-
@dr_alphalyrae
Vega Shah
on x
Today we launch NVIDIA's Nemotron 3 Super, a 120B param open model designed to run agentic AI systems across scientific, enterprise and industrial applications. Partners working with us include Dassault Systèmes, Palantir Technologies, Lila Sciences and Edison Scientific Key [ima…
-
@nvidiaaidev
@nvidiaaidev
on x
🦞These innovations come together to create a model that is well suited for long-running autonomous agents. On PinchBench—a benchmark for evaluating LLMs as @OpenClaw coding agents—Nemotron 3 Super scores 85.6% across the full test suite, making it the best open model in its [imag…
-
@kimmonismus
@kimmonismus
on x
NVIDIA just dropped Nemotron 3 Super - and the architecture is wild. I was able to check it out early, and I love it (thanks, @nvidia) - 120B parameters, but only 12B active. - A hybrid Mamba-Transformer MoE design that squeezes serious intelligence out of minimal compute. What [im…
-
@nvidiaaidev
@nvidiaaidev
on x
Introducing NVIDIA Nemotron 3 Super 🎉 Open 120B-parameter (12B active) hybrid Mamba-Transformer MoE model Native 1M-token context Built for compute-efficient, high-accuracy multi-agent applications Plus, fully open weights, datasets and recipes for easy customization and [video]
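The "120B-parameter (12B active)" framing repeated across these posts is standard sparse-MoE accounting: each token is routed to only a few experts, so per-token compute tracks the active subset rather than the full parameter count. A minimal toy sketch of plain top-k routing (illustrative sizes and a hypothetical router; Nemotron's actual Latent MoE design differs):

```python
# Toy sparse Mixture-of-Experts layer: many experts stored, few run per token.
# All sizes here are illustrative, not Nemotron's real configuration.
import numpy as np

rng = np.random.default_rng(0)

d_model, d_ff = 64, 256
n_experts, top_k = 8, 2   # only top_k of n_experts run for each token

# Each expert is a small 2-layer ReLU MLP.
experts = [
    (rng.standard_normal((d_model, d_ff)) * 0.02,
     rng.standard_normal((d_ff, d_model)) * 0.02)
    for _ in range(n_experts)
]
router = rng.standard_normal((d_model, n_experts)) * 0.02

def moe_forward(x):
    """Route one token vector x to its top_k experts and mix their outputs."""
    logits = x @ router
    chosen = np.argsort(logits)[-top_k:]      # indices of the top_k experts
    weights = np.exp(logits[chosen])
    weights /= weights.sum()                  # softmax over the chosen experts
    out = np.zeros_like(x)
    for w, i in zip(weights, chosen):
        w1, w2 = experts[i]
        out += w * (np.maximum(x @ w1, 0) @ w2)
    return out

total_params = sum(w1.size + w2.size for w1, w2 in experts)
active_params = top_k * 2 * d_model * d_ff    # params actually used per token
print(total_params, active_params)            # → 262144 65536
```

With 2 of 8 experts active, per-token parameters are a quarter of the total stored; the same kind of gap (12B active of 120B total) is what lets the posts above claim large-model accuracy at small-model inference cost.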
-
@benitoz
Ben Pouladian
on x
Nemotron 3 Super ships exactly what I mapped in December: Mamba hybrid, Latent MoE, multi-token prediction, NVFP4 on Blackwell. 120B params, 12B active, 5x throughput. Full-stack co-design, silicon to model. No paywall👇🏽 https://bepresearch.substack.com/ ...
-
r/technology
r
on reddit
Nvidia Will Spend $26 Billion to Build Open-Weight AI Models, Filings Show
-
@jack
@jack
on x
this would be excellent
-
@miles_brundage
Miles Brundage
on x
I don't think there's a *super* strong reason to take this more seriously than Meta's earlier commitment to open source which was walked back, but a weak reason to think it's real is that NVIDIA benefits from model commoditization more than Meta did https://x.com/...