The Allen Institute for AI releases Tulu 3 405B, an open-source model that it claims outperforms DeepSeek V3 and OpenAI's GPT-4o on certain benchmarks
Move over, DeepSeek. There's a new AI champion in town — and they're American. — On Thursday, Ai2, a nonprofit AI research institute based …
TechCrunch · Kyle Wiggers
Related Coverage
- How DeepSeek ripped up the AI playbook—and why everyone's going to follow it MIT Technology Review · Will Douglas Heaven
- Scaling the Tülu 3 post-training recipes to surpass the performance of DeepSeek V3 Ai2
- Mistral AI Introduces Mistral Small 3, a High-Speed Open LLM to Compete with GPT-4o Mini WinBuzzer · Markus Kasanmascheff
- Ai2's new Tulu 3 model rivals tech giants in breakthrough for open-source AI post-training GeekWire · Todd Bishop
- Mistral, Ai2 release new open-source LLMs SiliconANGLE · Maria Deutscher
- Alibaba's New AI Model Clocks in to Pass DeepSeek and OpenAI Techstrong.ai · Jon Swartz
- Ai2 releases Tülu 3, a fully open-source model that bests DeepSeek v3, GPT-4o with novel post-training approach VentureBeat · Sean Michael Kerner
Discussion
- Jay Cuthrell (@cuthrell.com) on Bluesky: As the newsworthy claims and benchmarks seasons get shorter, it will be fascinating to see how @mlcommons.org grows.
- Nathan Lambert (@natolambert) on X: Very happy to show that we can do RL finetuning on 405B models with open-source code, beat Llama 405B Instruct with their base model, and beat DeepSeek V3 too. Enjoy building off this team's hard work. Here's Tulu 3 405B, a holiday present from @hamishivi, @vwxyzjn, and the team.
- Costa Huang (@vwxyzjn) on X: 🎁 Happy New Year!!! We are bringing new RL curves as presents. This time, we went beeeeeg (405B). RLVR + MATH just worked: training and testing performance are still going up 😍 [image]
- Hanna Hajishirzi (@hannahajishirzi) on X: Excited to release our newest, largest, and best Tulu yet. Our RLVR recipe works at scale, outperforming DeepSeek V3. So proud of the team, and of @hamishivi and @vwxyzjn for scaling up the Tulu recipe. [image]
- Hamish Ivison (@hamishivi) on X: li'l holiday project from the tulu team :) Scaling up the Tulu recipe to 405B works pretty well! We mainly see this as confirmation that open-instruct scales to large-scale training; more exciting and ambitious things to come! [image]
- Tim Dettmers (@tim_dettmers) on X: Beating DeepSeek-V3 with a 405B Llama base is not easy; solid post-training goes a long way. The nice thing is that it is fully open source, so anyone can use this recipe for their base models.
- Ai2 (@allen_ai) on X: Here is Tülu 3 405B 🐫 our open-source post-training model that surpasses the performance of DeepSeek-V3! The last member of the Tülu 3 family demonstrates that our recipe, which includes Reinforcement Learning with Verifiable Rewards (RLVR), scales to 405B, with performance on … [image]
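For readers new to the acronym: RLVR replaces the learned reward model of conventional RLHF with a programmatic check against a verifiable ground truth (for example, the final answer to a MATH problem), producing a binary reward that is then fed to a standard RL optimizer such as PPO. Below is a minimal sketch of such a reward function; the function names and the \boxed{} extraction heuristic are illustrative assumptions, not code from Ai2's open-instruct repository.

```python
# Minimal sketch of a "verifiable reward" in the spirit of RLVR.
# NOTE: the names and the answer-extraction heuristic here are
# illustrative assumptions, not Ai2's actual implementation.
import re

def extract_final_answer(completion: str) -> str | None:
    """Pull the last \\boxed{...} expression from a model completion,
    a common answer-formatting convention on math benchmarks."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", completion)
    return matches[-1].strip() if matches else None

def verifiable_reward(completion: str, ground_truth: str) -> float:
    """Binary reward: 1.0 if the extracted answer matches the reference
    exactly, else 0.0. No learned reward model is involved."""
    answer = extract_final_answer(completion)
    return 1.0 if answer is not None and answer == ground_truth.strip() else 0.0

# This scalar would replace a reward-model score in the RL loop.
print(verifiable_reward("... so the result is \\boxed{42}.", "42"))  # 1.0
print(verifiable_reward("... the answer is 41.", "42"))              # 0.0
```

The appeal of this setup, and a plausible reason the posts above emphasize that "RLVR + MATH just worked" at 405B, is that a programmatic verifier cannot be reward-hacked the way a learned reward model can, so the training signal stays clean as scale increases.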