Chronicles

The story behind the story


The Allen Institute for AI releases Tulu 3 405B, an open source model that it claims outperforms DeepSeek V3 and OpenAI's GPT-4o on certain benchmarks

Move over, DeepSeek. There's a new AI champion in town — and they're American. On Thursday, Ai2, a nonprofit AI research institute based …

Kyle Wiggers, TechCrunch

Discussion

  • @cuthrell.com Jay Cuthrell on bluesky
    As the newsworthy claims and benchmarks seasons get shorter it will be fascinating to see how @mlcommons.org grows
  • @natolambert Nathan Lambert on x
    Very happy to show that we can do RL finetuning on 405B models with open-source code, beat Llama 405B instruct with their base model, and beat DeepSeek V3 too. Enjoy building off this team's hard work. Here's Tulu 3 405B. A holiday present from @hamishivi, @vwxyzjn and team.
  • @vwxyzjn Costa Huang on x
    🎁 Happy New Year!!! We are bringing new RL curves as presents. This time, we went beeeeeg (405B). RLVR + MATH just worked: training and testing performance are still going up 😍 [image]
  • @hannahajishirzi Hanna Hajishirzi on x
    Excited to release our newest, largest, and best Tulu yet. Our RLVR recipe works at scale, outperforming Deepseek V3. So proud of the team! And @hamishivi @vwxyzjn for scaling up the Tulu recipe. [image]
  • @hamishivi Hamish Ivison on x
    li'l holiday project from the tulu team :) Scaling up the Tulu recipe to 405B works pretty well! We mainly see this as confirmation that open-instruct scales to large-scale training — more exciting and ambitious things to come! [image]
  • @tim_dettmers Tim Dettmers on x
    Beating DeepSeek-V3 with a 405B Llama base is not easy — solid post-training goes a long way. The nice thing is that it is fully open-source, so anyone can use this recipe for their base models.
  • @allen_ai @allen_ai on x
    Here is Tülu 3 405B 🐫 our open-source post-training model that surpasses the performance of DeepSeek-V3! The last member of the Tülu 3 family demonstrates that our recipe, which includes Reinforcement Learning from Verifiable Rewards (RLVR), scales to 405B - with performance on [i…