How DeepSeek outpaced OpenAI at a fraction of the cost: open-source weights, pure reinforcement learning with no supervised fine-tuning in DeepSeek-R1-Zero, and R1 built on top of it
DeepSeek R1's Monday release has sent shockwaves through the AI community, disrupting assumptions about what's required to achieve cutting-edge AI performance.
Matt Marshall / VentureBeat
Related Coverage
- Chinese AI chatbot DeepSeek sparks market turmoil BBC
- Why Is DeepSeek Sinking Nvidia Stock? Forbes
- China's DeepSeek AI rattles global tech markets Capital Brief
- A new AI assistant from China has Silicon Valley talking NBC News
- How the buzz around Chinese AI model DeepSeek sparked a massive Nasdaq sell-off CNBC
- China's AI Startup DeepSeek Will Make Advertisers Rethink How They Spend MediaPost
- Chinese AI startup DeepSeek is threatening Nvidia's AI dominance Fortune
- China's DeepSeek just dropped a free challenger to OpenAI's o1 - here's how to use it on your PC The Register
- 'AI's Sputnik moment': China-based DeepSeek's open-source models may be a real threat to the dominance of OpenAI, Meta, and Nvidia PC Gamer
- DeepSeek-R1: The Open-Source AI Challenging ChatGPT Search Engine Journal
- The AI Cost Curve Just Collapsed Again Tomasz Tunguz
- Chip Stocks Tumble After China's DeepSeek AI Models Raise Doubts Over U.S. Tech Dominance Wall Street Journal
- Why DeepSeek is hitting tech stocks hard, including Nvidia's Mashable
- Why is DeepSeek AI suddenly so popular? BGR
- DeepSeek unveils new R1 open-source AI model Verdict
- What is DeepSeek and how is it different to other AI models? The Times
- DeepSeek R1 is wildly overhyped, but you can try it at home Pivot to AI
- DeepSeek R1 is the Chinese AI model disrupting OpenAI and Anthropic — what you need to know Tom's Guide
- How to run DeepSeek's AI model on PC for free NewsBytes
- AI startup DeepSeek rivals OpenAI models using far fewer resources, shocks AI industry TechSpot
- China's DeepSeek AI Moves the Capital of Tech from Palo Alto to Hangzhou LewRockwell
- Open-Source DeepSeek R1 LLM Matches OpenAI's o1 for a Fraction of the Cost TechEBlog
Discussion
-
@deedydas
Deedy
on x
DeepSeek R1 isn't just “25x cheaper than GPT o1”... It is better than the unreleased OpenAI o3 at the same cost at coding on Codeforces and ARC-AGI! [image]
-
@natfriedman
Nat Friedman
on x
The deepseek team is obviously really good. China is full of talented engineers. Every other take is cope. Sorry.
-
@morganb
Morgan Brown
on x
7/ The results are mind-blowing:
- Training cost: $100M → $5M
- GPUs needed: 100,000 → 2,000
- API costs: 95% cheaper
- Can run on gaming GPUs instead of data center hardware
-
@0xkarmatic
Karma
on x
The visible chains of thought in DeepSeek R1 make it so easy to prompt, as you can clearly tell when your instructions were ambiguous. A missed opportunity from OpenAI not to make their CoTs visible. Now that the genie is out of the bottle and we have a reproduction of an o1-like…
-
@morganb
Morgan Brown
on x
🧵 Finally had a chance to dig into DeepSeek's r1... Let me break down why DeepSeek's AI innovations are blowing people's minds (and possibly threatening Nvidia's $2T market cap) in simple terms...
-
@morganb
Morgan Brown
on x
2/ DeepSeek just showed up and said “LOL what if we did this for $5M instead?” And they didn't just talk - they actually DID it. Their models match or beat GPT-4 and Claude on many tasks. The AI world is (as my teenagers say) shook.
-
@itsolelehmann
Ole Lehmann
on x
DeepSeek is a 100x more based name than ChatGPT or Claude
-
@emollick
Ethan Mollick
on x
I think the market will adjust to any per token cost decrease brought on by DeepSeek quite quickly. Costs for GPT-4 level intelligence dropped by 1000x in the last 18 months. A 95% price drop in reasoning models seems not to be something that will break the labs.
-
@beeple
@beeple
on x
DEEPSEEK v. OPENAI [image]
-
@ananayarora
@ananayarora
on x
DeepSeek has had a private proxy to OpenAI at least until 2024-08-10. Its existence hints that they probably didn't pay the regular API pricing to OpenAI and instead used a fleet of bots to query ChatGPT during training [image]
-
@morganb
Morgan Brown
on x
8/ “But wait,” you might say, “there must be a catch!” That's the wild part - it's all open source. Anyone can check their work. The code is public. The technical papers explain everything. It's not magic, just incredibly clever engineering.
-
@teknium1
@teknium1
on x
It's crazy that DeepSeek's direct API seemingly has no rate limits of any kind
-
@emostaque
Emad
on x
A simpler way to see that DeepSeek weren't lying about 50k H100s or training costs for V3/R1: we have the model, and it's a 671B-parameter Mixture of Experts with 37B active. We know that spec takes 2-3M GPU-hours to train. Models get worse with more compute after a certain point! https://www.harmdevries.com…
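The GPU-hours figure above can be sanity-checked with back-of-envelope arithmetic. A minimal sketch, assuming DeepSeek-V3's reported ~2.788M H800 GPU-hours for the final training run and an assumed $2/GPU-hour rental rate (both figures are assumptions not taken from this thread):

```python
# Back-of-envelope check of the "$5M" training-cost claim.
gpu_hours = 2.788e6          # reported H800 GPU-hours for the final V3 run (assumption)
rate_per_gpu_hour = 2.00     # assumed cloud rental rate in USD per GPU-hour
cost = gpu_hours * rate_per_gpu_hour
print(f"estimated compute cost: ${cost / 1e6:.1f}M")
```

At these rates the final run lands in the mid-single-digit millions, which is consistent with the ~$5M number circulating, though it excludes research, failed runs, staff, and hardware ownership costs.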
-
@arithmoquine
Henry
on x
i've made over 200,000 requests to the deepseek api in the last few hours. zero ratelimiting, and the whole thing cost me like 50 cents. bless the CCP, openai could never
-
@snowmaker
Jared Friedman
on x
Lots of hot takes on whether it's possible that DeepSeek made training 45x more efficient, but @doodlestein wrote a very clear explanation of how they did it. Once someone breaks it down, it's not hard to understand. Rough summary: * Use 8-bit instead of 32-bit floating point
-
@rakyll
Jaana Dogan
on x
DeepSeek codebases are clean and well authored. I learned a lot just reading their work over the weekend. You cannot deny that they are raising the bar, and I wish we focused on quality instead of short-sighted incremental work.
-
@morganb
Morgan Brown
on x
6/ Traditional models? All 1.8 trillion parameters active ALL THE TIME. DeepSeek? 671B total but only 37B active at once. It's like having a huge team but only calling in the experts you actually need for each task.
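The "only call in the experts you need" idea is top-k gated Mixture-of-Experts routing. A minimal sketch in numpy (dimensions, expert count, and the gating matrix here are illustrative, not DeepSeek's actual configuration):

```python
import numpy as np

def topk_gate(x, W_gate, k=8):
    """Route one token to its top-k experts (sketch of MoE routing).

    Only the k selected experts run a forward pass for this token,
    so most expert parameters stay idle on any given step.
    """
    logits = x @ W_gate                       # one gating logit per expert
    chosen = np.argsort(logits)[-k:]          # indices of the k highest-scoring experts
    weights = np.exp(logits[chosen] - logits[chosen].max())
    weights /= weights.sum()                  # softmax over the chosen k only
    return chosen, weights

rng = np.random.default_rng(0)
d_model, n_experts, k = 16, 64, 8
x = rng.normal(size=d_model)
W_gate = rng.normal(size=(d_model, n_experts))
experts, weights = topk_gate(x, W_gate, k)
print(f"active experts: {k}/{n_experts} -> {k / n_experts:.0%} of expert params touched")
```

Scaled up, the same mechanism is how a 671B-parameter model can activate only ~37B parameters (about 5.5%) per token.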
-
@theshortbear
@theshortbear
on x
DeepSeek seems to have created a panic moment within the biggest companies and it should alarm investors. Costs: 2,048 Nvidia H800 GPUs: $40-50 million · Training: $5 million If all it takes to beat OpenAI is a maximum of $55 million, the industry is becoming commoditized way f…
-
@firstadopter
Tae Kim
on x
It's silly town on here right now as engagement farmers compare apples and oranges but I'll just cite Bernstein (Bernstein is right): “Did DeepSeek really build OpenAI for $5 million? Of course not” “a fundamental misunderstanding over the “$5M” number” “categorically false”
-
@pitdesi
Sheel Mohnot
on x
So many viral tweets comparing OpenAI's $6.6B raised to <$10M from DeepSeek 🤦🏽♂️ People are so dumb.
-
@morganb
Morgan Brown
on x
3/ How? They rethought everything from the ground up. Traditional AI is like writing every number with 32 decimal places. DeepSeek was like “what if we just used 8? It's still accurate enough!” Boom - 75% less memory needed.
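The "8 instead of 32" point is about numeric precision, not decimal places: storing weights in 8 bits instead of 32-bit floats cuts storage by 4x, i.e. 75% less memory. A minimal sketch using int8 quantization with a per-tensor scale (DeepSeek's actual training uses hardware FP8 formats; int8 here just illustrates the memory arithmetic):

```python
import numpy as np

# Quantize fp32 weights to 8 bits with a per-tensor scale.
w = np.random.default_rng(0).normal(size=1024).astype(np.float32)
scale = np.abs(w).max() / 127.0
w_q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
w_hat = w_q.astype(np.float32) * scale        # dequantize for use in compute

saved = 1 - w_q.nbytes / w.nbytes             # 4 bytes -> 1 byte per weight
err = np.abs(w - w_hat).max()                 # worst-case rounding error
print(f"memory saved: {saved:.0%}, max error: {err:.4f} (scale: {scale:.4f})")
```

The rounding error stays below one quantization step, which is the "still accurate enough" trade-off the tweet describes.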
-
@wordgrammer
@wordgrammer
on x
Okay. Thanks for the nerd snipe guys. I spent the day learning exactly how DeepSeek trained at 1/30 the price, instead of working on my pitch deck. The tl;dr to everything, according to their papers:
-
@morganb
Morgan Brown
on x
1/ First, some context: Right now, training top AI models is INSANELY expensive. OpenAI, Anthropic, etc. spend $100M+ just on compute. They need massive data centers with thousands of $40K GPUs. It's like needing a whole power plant to run a factory.