Yann LeCun says DeepSeek “profited from open research and open source” like Meta's Llama and is proof that open source models are surpassing proprietary ones
Marc Andreessen, a co-inventor of the pioneering Mosaic web browser, co-founder of the Netscape browser company, and current general partner at the famed Andreessen Horowitz (a16z) venture capital firm, posted on X today: “Deepseek R1 is one of the most amazing and impressive breakthroughs I've ever seen — and as open source, a profound gift to the world.” …

Threads:

@documentingmeta : You don't see Apple anywhere in this list because they were busy figuring out how to collect rent with the App Store and how to pump the stock price with buybacks instead of doing R&D.

Vishvanand Subramanian / @vishvanands : Thinking about how OpenAI arranged a 500 billion dollar investment for future models while a small lab spent 5 million as a side project to match the performance of their flagship model.

@luokai : The open-source ecosystem has indeed provided foundational fuel for the global development of AI. Technological advancement is not a zero-sum game. While DeepSeek utilizes PyTorch, it is also actively giving back to the community. The progress of Chinese AI is fundamentally one of the fruits borne out of the global open-source movement.

Scott P / @sfscottp : This post is like “China isn't beating us in AI - we're giving the tech away for free so they can catch up!”

Bilawal Sidhu / @bilawal.ai : Yann makes a good point here — open source *is* powerful. But also, constraints breed creativity.

Sachin Guliani / @sgulsach : DeepSeek outperforming Llama does not matter for Meta. Meta never intended to be the main LLM; they mainly wanted to devalue competition by jump-starting the open-source LLM race. It seems like the real winners of Gen AI are actually AWS, Azure, and I guess now Oracle as well. …

Benedict Evans / @benedictevans : Optimal outcome for Meta: LLMs are cheap commodity infrastructure based on OSS that Meta leads. Desired outcome for Meta: LLMs are cheap commodity OSS infra based on OSS. Deepseek isn't 1, but 2 is fine.
X:

Yann LeCun / @ylecun : @guybedo You misunderstand how open research and open source work. The idea is that everyone profits from everyone else's ideas. No one “outpaces” anyone and no country “loses” to another. No one has a monopoly on good ideas.

Yann LeCun / @ylecun : Nice job! Open research / open source accelerates progress.

@guybedo : @ylecun So basically Meta outpaced by a small startup and in crisis mode. “AI leaders making more than what it cost to train this model” I thought humans would lose their jobs to AI agents, but it seems it's gonna start with US researchers losing to Chinese startupers. Interesting times.

Roon / @tszzl : im glad people are getting to read R1 raw chains of thought. fascinating stuff, agi smell

Ethan Mollick / @emollick : After a decent amount of use, DeepSeek is an impressive model, even before you add in the fact that it is open and cheap and small... ...but it really doesn't equal the big closed models of Sonnet, o1, and Gemini 2.0 (though the gaps are not huge, they become clear with usage)

François Fleuret / @francoisfleuret : This being said, here is the TL;DR: On the model architecture side, @deepseek_ai v3/r1 is a standard GPT that is a “causal decoder only”, hence an auto-regressive model made of causal attention blocks. It is huge, with 671 billion parameters. 1/6

Sam Biddle / @samfbiddle : Is DeepSeek lying about its model? Maybe! I certainly would not say that “misleading the public about an LLM” is a Chinese thing, though.

@signulll : the r1 drop should be setting off alarm bells in the entire western capitalist apparatus. for the first time in a while, i'm genuinely afraid the u.s. is losing its dominance — its influence, its capacity to innovate, & its ability to outcompete on a global scale. the core issue

Packy McCormick / @packym : Maybe you should stop tweeting about DeepSeek and Seek God.
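Fleuret's thread above describes the architecture as a “causal decoder only”, i.e. an auto-regressive transformer whose defining constraint is the causal attention mask: each token may attend only to itself and earlier positions. A minimal pure-Python sketch of that idea (illustrative only, not DeepSeek's actual code; the function names are hypothetical):

```python
import math

def causal_mask(seq_len):
    """seq_len x seq_len matrix: True where query position i may attend
    to key position j, i.e. only when j <= i (no looking at the future)."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]

def masked_softmax_row(scores, allowed):
    """Softmax over one row of attention scores, with disallowed
    (future) positions forced to zero weight."""
    exps = [math.exp(s) if a else 0.0 for s, a in zip(scores, allowed)]
    total = sum(exps)
    return [e / total for e in exps]

mask = causal_mask(4)
# Query at position 1 with uniform scores: attention splits evenly over
# positions 0 and 1; future positions 2 and 3 get exactly zero weight.
weights = masked_softmax_row([0.0, 0.0, 0.0, 0.0], mask[1])
print(weights)  # [0.5, 0.5, 0.0, 0.0]
```

Stacking attention blocks constrained this way is what makes the model auto-regressive: at generation time each new token is predicted from the prefix alone, one position at a time.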
David / @davidsholz : From the latest Chinese (Deepseek) LLM AI model: “The difference (between us) is not metaphysical but architectural: humans have a physically continuous substrate that hosts consciousness; LLMs have a discontinuous, stateless instantiation with no consciousness. Both are

Yann LeCun / @ylecun : @0xPBIT In the open source world, there are only winners.

Adam Johnson / @adamjohnsonchi : @samfbiddle @willmenaker Well the line for years was that the Chinese were entirely derivative automatons who could never innovate, only rip off the Noble Liberal West, but then this became untenable, so now every company is part of a diabolical plot to weaken America.

Sam Biddle / @samfbiddle : Meta has defended its free release of Llama on explicitly anti-China competition grounds! This is unsurprisingly not, however, a psyop, or economic warfare. [image]

Andriy Burkov / @burkov : No, it's not open-source AI beating closed-source, as LeCun claimed today on LinkedIn. It's a resource-constrained but very focused team of creative people beating teams spoiled with resources, with their leaders hyping and wasting resources on problems that they know cannot be

Emad / @emostaque : With the deepseek tpot discussion you'll find the ones who think they are lying are those that haven't trained state of the art models. Nobody familiar with the literature and sector thinks they've fudged things and it's out of whack, just that they are very smart and cracked 🐳

Marc Andreessen / @pmarca : This week may have been the most important week of the decade, for two totally different reasons. 🤯

Arnaud Bertrand / @rnaudbertrand : The Deepseek moment isn't just about AI. It's also about the world realizing that China has caught up - and in some areas overtaken - the US in tech and innovation, despite the efforts to prevent just that.
A stunning shift when just 10 years ago, Harvard Business Review was running articles on “Why China Can't Innovate” (https://hbr.org/2014/03/why-china-cant-innovate). Now you have Silicon Valley's most legendary investor calling a Chinese AI project one of the most impressive breakthroughs he's ever seen.

Tanay Jaipuria / @tanayj : Even if Deepseek v3's final training run cost $5.5M, it had additional costs including:
• Cost of test runs and experiments: $10-15M+
• Spend on team (139 authors on technical paper): $15M+/yr
• Spend on OpenAI model inference for distillation purposes: $5M+ (?)
And this

Sam Biddle / @samfbiddle : I will never stop pointing out the irony of China hawks coopting China's historical rationale for blocking American technology without even the slightest trace of irony or self-awareness. “It's a western conspiracy to destabilize our nation” is the oldest trick in the book!

Michael Ron Bowling / @mrbcyber : The CCP has been doing everything possible to destroy western tech companies, so this fits.

Joe Weisenthal / @thestalwart : Suppose this were provably true and everyone accepted it as fact. What would be the ramifications/response?

Avi / @avischiffmann : The only real way for US startups to compete with China is on taste. I ain't ever see a Chinese product that didn't feel Temu.

Kache / @yacinemtb : the r1 open source story isn't just one of excited hackers on the internet. it's one of large software companies, who have a lot of money for Muh Enterprise Deal. Do you know what they care about, more than anything? Data locality. You simply cannot beat open source

Kache / @yacinemtb : This is a regulation thing. It's also a responsibility thing. You *cannot* send data outside of your network to an untrusted one that you cannot control. The amount of certainty required for data privacy is one of control over the infrastructure.
Anything less is insufficient.

LinkedIn:

Yann LeCun : To people who see the performance of DeepSeek and think: “China is surpassing the US in AI.” You are reading this wrong. The correct reading is: …

Omar Tawakol : People on the OpenAI train benefit from paralyzing would-be competitors into believing resistance is futile. DeepSeek obviously felt otherwise. …

Forums:

r/OpenAI : Yann LeCun's Deepseek Humble Brag