Chronicles

The story behind the story


The Allen Institute for AI releases Tulu 3 405B, an open source model that it claims outperforms DeepSeek V3 and OpenAI's GPT-4o on certain benchmarks

Move over, DeepSeek. There's a new AI champion in town — and they're American. On Thursday, Ai2, a nonprofit AI research institute based …

Kyle Wiggers, TechCrunch

Discussion

  • @cuthrell.com Jay Cuthrell on bluesky
    As the newsworthy claims and benchmarks seasons get shorter it will be fascinating to see how @mlcommons.org grows
  • @natolambert Nathan Lambert on x
    Very happy to show that we can do RL finetuning on 405B models with open-source code, beat Llama 405B instruct with their base model, and beat DeepSeek V3 too. Enjoy building off this team's hard work. Here's Tulu 3 405B. A holiday present from @hamishivi, @vwxyzjn and team.
  • @vwxyzjn Costa Huang on x
    🎁 Happy New Year!!! We are bringing new RL curves as presents. This time, we went beeeeeg (405B). RLVR + MATH just worked: training and testing performance are still going up 😍 [image]
  • @hannahajishirzi Hanna Hajishirzi on x
    Excited to release our newest, largest, and best Tulu yet. Our RLVR recipe works at scale, outperforming Deepseek V3. So proud of the team! And @hamishivi @vwxyzjn for scaling up the Tulu recipe. [image]
  • @hamishivi Hamish Ivison on x
    li'l holiday project from the tulu team :) Scaling up the Tulu recipe to 405B works pretty well! We mainly see this as confirmation that open-instruct scales to large-scale training — more exciting and ambitious things to come! [image]
  • @tim_dettmers Tim Dettmers on x
    Beating DeepSeek-V3 with a 405B Llama base is not easy — solid post-training goes a long way. The nice thing is that it is fully open-source, so anyone can use this recipe for their base models.
  • @allen_ai @allen_ai on x
    Here is Tülu 3 405B 🐫 our open-source post-training model that surpasses the performance of DeepSeek-V3! The last member of the Tülu 3 family demonstrates that our recipe, which includes Reinforcement Learning from Verifiable Rewards (RLVR), scales to 405B - with performance on [i…