Industry insiders say DeepSeek's focus on research makes it a dangerous competitor as it's willing to share breakthroughs rather than protect them for profits

Mastodon: Brian Kung / @briankung@hachyderm.io : “There's a pretty delicious, or maybe disconcerting irony to this, given OpenAI's founding goals to democratize AI for the masses. As Nvidia senior research manager Jim Fan put it on X: “We are living in a timeline where a non-US company is keeping the original mission of OpenAI alive — truly open, frontier research that empowers all. … John Carlos Baez / @johncarlosbaez@mathstodon.xyz : Quotes from today's — “A small Chinese artificial intelligence lab stunned the world this week by revealing the technical recipe for its cutting-edge model, turning its reclusive leader into a national hero who has defied US attempts to stop China's high-tech ambitions. … X:
Shubham Saboo / @saboo_shubham_ : DeepSeek R1 is 100% Opensource and 96.4% cheaper than OpenAI o1 while delivering similar performance. OpenAI o1: $60.00 per 1M output tokens DeepSeek R1: $2.19 per 1M output tokens People with $200 ChatGPT subscription, let that sink in. [video] Cristian Garcia / @cgarciae88 : “deepseek is just a hack” “they trained on o1” the cope is unreal Philip Pilkington / @philippilk : Maybe, hear me out here, AI was massively overhyped because NVIDIA is one the last remaining viable American hardware companies and Deepseek is just exposing the whole sector as a giant bubble full of capital misallocation and overinvestment. 🤔 Edward Ongweso Jr / @bigblackjacobin : whether or not deepseek is something, it is funny to see an industry that only exists cause it sucks at the state's teat 24/7 freak out. Silicon Valley only exists because we've committed to misallocation and overinvestment/overvaluation as an industrial policy for Some Reason Finbarr / @finbarrtimbers : The DeepSeek papers are remarkable in their level of detail. 
For instance, DeepSeekMath, which introduced GRPO, goes into reproducible detail on how they created their math corpus from Common Crawl. [image] Michael P. Frank / @mikepfrank : Since R1 came out, people are talking like the massive compute farms deployed by Western labs are a waste, BUT THEY'RE NOT — don't you see? This just means that once the best of DeepSeek's clever cocktail of new methods are adopted by GPU-rich orgs, they'll reach ASI even faster. Dave Troy / @davetroy : On the one hand, Deepseek is going to prove deeply disruptive to the Silicon Valley ecosystem; but on the other hand it's better we call bullshit on this hype cycle sooner than later, even if it took the CCP to do it. Sam Altman is Elizabeth Holmes 2.0. Aaron Ng / @localghost : Here's Deepseek r1 1.5B thinking through a problem — it's comparable to 4o and Claude 3.5 Sonnet in a number of domains like math. Except... it's a 1.5B model... and can run on virtually any hardware. Truly a huge efficiency leap. [video] @signulll : i've been running deepseek locally (i have a highest end mac studio) for few days, & it's absolutely on par with o1 or sonnet. i've been using it nonstop for coding and other tasks, & what would've cost me a fortune through api's is now completely free. this feels like a total @orikron : Deepseek had a budget of $5 million and it beat Open AI's model. Open AI has a budget of $5 billion. That's 1000x ROI improvement. The US cannot build a technological moat when it's competing against far superior brains. Alex Kantrowitz / @kantrowitz : Serious question, if DeepSeek is this good, what happens to the companies spending billions building models with slightly better performance? [image] Jen Zhu / @jenzhuscott : This is actually DeepSeek's culture - give credits to the team and the CEO is invisible. The exact opposite to some. Ben Hunt / @epsilontheory : LLMs are operating systems. 
Open source, locally running, high performance LLMs like DeepSeek R1 are the new Linux and will have the same impact on competitive AI landscape. Haider / @slow_developer : FACT it's funny how a chinese company (DeepSeek) forces the US company (openAI) to kneel down and offer their latest model, o3-mini, for free to users agree? [image] Ben Hunt / @epsilontheory : I think it's more likely that the US govt tries to ban DeepSeek R1 than TikTok. [image] @tynervp : BREAKING: Deepseek rumoured to be training r2 on a previously unknown chip powered purely on american cope. Joe Weisenthal / @thestalwart : So funny everyone suddenly realizing that DeepSeek is legit. I've been paying attention to them since Tuesday. Henry Shevlin / @dioscuri : Deepseek R1 is ridiculously good. Better than any LLM I've ever used so far, and notably extremely low rates of hallucinations, even on questions that are designed to elicit them. Shaun Rein / @shaunrein : In 1 week, from DeepSeek to Red Note, Chinese tech companies have shattered the self-confidence of Silicon Valley Techbros at OpenAI, Facebook, Google wetting their pants I predicted this would happen in my 2014 book The End of Copycat China but SV & media attacked me as a @tsarnick : Perplexity CEO Aravind Srinivas says restrictions on chip imports to China are forcing them to innovate efficient solutions to AI model training, with DeepSeek trained on only 2048 H800 GPUs, making them the equivalent of DOGE for AI [video] Henry / @arithmoquine : i've made over 200,000 requests to the deepseek api in the last few hours. zero ratelimiting, and the whole thing cost me like 50 cents. bless the CCP, openai could never Arnaud Bertrand / @rnaudbertrand : That's actually a fantastic illustration why Western media's bias on China actually hurts the West more than it does China. 
Deepseek is only “startling” if you based your understanding on China off reporting by the likes of The Economist, who keep picturing a China as a @pastynome : DeepSeek is another example of China rapidly commoditizing new high tech industries and preventing the West from making excess profits or “rent” as it's called. If this keeps going, the standard of living in the West will fall. Emad / @emostaque : Deepseek have h800s which are h100s with reduced interconnect which is why they had to come up with lots of optimisations per their latest two papers No export controls on those so nothing to hide (@dylan522p the 50k is your guess?), my guess is they have 10-20k Deedy / @deedydas : DeepSeek built a high performance computer on Aug 31 with 10,000 A100 GPUs. A must-read paper only cited ONCE. In their V3 paper, the base for R1, they say they train on 2048 H800s, the export-controlled H100 with 50% the transfer rate. Why didn't they use the 10,000 A100s? [image] @thesiriusreport : Deepseek R1 has made many in the West finally question why everything we produce technologically costs ridiculous amounts of money to do so. Christian H. Cooper / @christiancooper : I asked #R1 to visually explain to me the Pythagorean theorem. This was done in one shot with no errors in less than 30 seconds. Wrap it up, its over: #DeepSeek #R1 [video] Sean Padraig McCarthy / @seanmccarthycom : Many reasons why China is winning and will win the tech race but a big one is university is affordable in China and it's now a debt slavery scheme here. They have access to their full human capital while in the US only upper class and above get the chance to create next DeepSeek Matt Clifford / @matthewclifford : Excellent take. While DeepSeek and the team behind R1 are super impressive, there's a huge amount of over correction going on... 
Ethan Mollick / @emollick : No matter how much you fight it, I find that the visible chain-of-thought from DeepSeek makes it nearly impossible to avoid anthropomorphizing the thing. The visible first-person “thinking” makes you feel like you are reading a diary of a somewhat tortured soul who wants to help [image] @silverspookguy : Thank you Chinese Deepseek devs for proving GenAI is a giant scam inflated by capitalists and is actually worth less than $5.5 million. [image] Ana Mostarac / @anammostarac : “Deepseek is a ccp state psyop + economic warfare to make american ai unprofitable” is an embarrassingly low agency, defeatist take. Malcolm Harris / @bigmeaninternet : Is one of the reasons DeepSeek took such a leap that they embraced technical leadership, vs. American industry which seems mostly business led and designed to absorb a maximum of investment capital? Prakash / @8teapi : Deepseek is not a “side project”. At the same time employees are not lying when they say it is. The story they are telling is myth making in the same vein in the Silicon Valley “we want to make the world a better place” but at the same time make billions of dollars. The team [video] Dean W. Ball / @deanwball : The amount of factually incorrect information and hyperventilating takes on deepseek on this website is truly astounding. I assumed that an object-level analysis was unnecessary but apparently I was wrong. Here you go: 1. DeepSeek is an extremely talented team and has been Matt Bruenig / @mattbruenig : Maybe trying to keep China from importing technology and thereby forcing them to innovate it better themselves is making them stronger, as with deepseek. Funny situation Philip Pilkington / @philippilk : How many examples do we need of export restrictions and sanctions on China driving innovation before policymakers in DC get it through their skulls? Really, it's getting embarassing at this stage. DeepSeek is only the latest leapfrog in part created by these restrictions. 
🇨🇳🇺🇸 [image] @_its_not_real_ : Registering prediction: DeepSeek is over fitted to the most common 70% of use cases, trained on synthetic ChatGPT data, and the actual budget given is BS. There is nothing novel to it and once Meta et Al tear into it this will be apparent. Kate Willett / @katewillett : China releasing Deepseek a day after this is hilarious. Jack Morris / @jxmnop : i guess DeepSeek broke the proverbial four-minute-mile barrier. people used to think this was impossible. and suddenly, RL on language models just works and it reproduces on a small-enough scale that a PhD student can reimplement it in only a few days this year is going to Jacob Silverman / @silvermanjacob : DeepSeek just smoked OpenAI by being more innovative and far more efficient, not stealing tech Jordi Hays / @jordihays : What TikTok is to 22 year old girls, DeepSeek is to developers. Artificial views and likes are just replaced with artificially low inference costs. @hesamation : Respect! Hugging Face 🤗 is reproducing the whole DeepSeek R1 pipeline to be used by open source community. So far it has > GRPO implementation > train and evaluation code > generator for synthetic data [image] Tim Hwang / @timhwang : Rumors out today that Deepseek is being aided by a secretive global terrorist organization known only by its sinister alias: Meta Joe Weisenthal / @thestalwart : I wrote about how easily and quickly I was able to switch from using ChatGPT to using DeepSeek for my random day-to-day AI queries https://www.bloomberg.com/... Rush Doshi / @rushdoshi : Very useful context on DeepSeek. Did they really accomplish all that on just 5 million? Seems not. Probably more like $1 billion. 
@andr3jh : OpenAI engineers browsing the “our team” section of the DeepSeek website and recognizing their ex-girlfriends [image] Darshan Sanghrajka / @chiefchimpanzee : It's pretty amazing that a bunch of quants at a hedge fund in China made Deepseek for $6m and did it as a side project 🤣 Every Western AI bro that has burned BILLIONS just got sideswiped by them. No wonder Sam Altman is out there with red eyes chatting utter nonsense. Minh Nhat Nguyen / @menhguin : R1 model aside, did anyone notice that Deepseek app has multimodality, PDF upload and search, which not even O1 pro has rn? [image] Emad / @emostaque : This is the crazy thing about conspiracy theories about @deepseek_ai, they open source their models have have fantastic detailed papers! Everyone who paid attention has known how great their work has been and aren't hugely surprised by R1 🐳 @citrini7 : deepseek being the brainchild of a chinese hedge fund is such a complete total cultural victory by capitalism that, ideologically at least, it should soften the blow. Philip Pilkington / @philippilk : This is what makes the DeepSeek thing so funny. A bunch of grifters have been selling AI secret sauce for years - spooky mystery juice that could never be fully explained. Now a bunch of young guys just wrote a good algo, published it, and the circus tent burned down. 🤖 @qcapital2020 : So wait wait wait , the founder of DeepSeek is basically the Jim Simons of China and was doing this LLM thing only as a side project and for $6M was able to dethrone every AI company in the world. We are so cooked LOL [image] Adam / @adamemedia : China has created one of the world's best AI models for only $6 million, as opposed to the billions spent by Facebook, Google, Microsoft etc. 
And DeepSeek is open-sourced, while the US models are proprietary and secretive—exposing the West's bloated, profit-driven approach to [image] Arnaud Bertrand / @rnaudbertrand : All these posts about Deepseek “censorship” just completely miss the point: Deepseek is Open Source under MIT license which means anyone is allowed to download the model and fine-tune it however they want. Which means that if you wanted to use it to make a model whose purpose is [image] Q. Anthony Ali / @nobleqali : The more I read these responses to China dropping DeepSeek into the public domain, the more I realize this was a deliberate attack on the tech oligopolists. They're freaking out because they've been exposed as overpaid hacks @hamptonism : > be an electric engineering student > team up w/ cracked classmates > start quant trading *we're so cracked* > founded a quant firm in his 30's > makes ¥100B trading with ai/ml *we're even more cracked with ai* > buys thousands of Nvdia GPUs > creates DeepSeek as a side project [image] @basedbeffjezos : The last thing your AI lab sees before getting one-shotted by a Chinese bootleg LLM team with a $5M cluster [image] Arnaud Bertrand / @rnaudbertrand : All benchmarks now confirm it: Deepseek is truly is as good as OpenAI's o1 (which is top of the range) for 3% of the price. Boom. And that's when you want to pay for the API. You can also use it Open Source for “free” (which you can't do with o1). There's no overstating how [image] Emad / @emostaque : The furore over DeepSeep R1 costing $5.5m buried the actual lede.. It didn't. That's how much the base v3 model cost (GPT 4o level). R1 likely cost O($100k), but the real buried lede is the distilled versions of Qwen & Llama which cost O($10k) to tune & can still improve.. 
Anjney Midha / @anjneymidha : From Stanford to MIT, deepseek r1 has become the model of choice for America's top university researchers basically overnight Kevin Roose / @kevinroose : It's sort of funny that every American tech company is bragging about how much money they're spending to build their models, and DeepSeek is just like “yeah we got there with $47 and a refurbished Chromebook” Samuel Hammond / @hamandcheese : This is wrong on several levels. - DeepSeek trains on h100s. Their success reveals the need to invest in export control *enforcement* capacity. - CoT / inference-time techniques make access to large amounts of compute *more* relevant, not less, given the trillions of tokens Emad / @emostaque : Deepseek are not faking the cost of the run. It's pretty much in line with what you'd expect given the data, structure, active parameters and other elements and other models trained by other people You can run it independently at the same cost It's a good lab working hard 😓 Holger Zschaepitz / @schuldensuehner : China's #DeepSeek could represent the biggest threat to US equity markets as the company seems to have built a groundbreaking AI model at an extremely low price and w/o having access to cutting-edge chips, calling into question the utility of the hundreds of billions worth of [image] Nabeel S. Qureshi / @nabeelqu : Everyone is way overindexing on the $5.5m final training run number from DeepSeek. - GPU capex probably $1BN+ - Running costs are probably $X00M+/year - ~150 top-tier authors on the v3 technical paper, $50m+/year They're not some ragtag outfit, this was a huge operation. @avichal : What's more likely? 
1 - small group of AI engineers at @deepseek_ai figures out how to beat all of the top researchers in the world as a side project 2 - Chinese government has 100k GPUs they shouldn't have and releases open source models claiming $6m training cost as a psyop Kache / @yacinemtb : Closed Source AI is DEAD Do you know how much lawyers you need to employ to even justify using openai's API for some internal use case? It would take your lawyers 6 months of lead time to sort it all out. And by then, deepseek and qwen will have released 2 new models Tae Kim / @firstadopter : The probability it cost DeepSeek $6 million of spending (R&D) to create their models is ZERO if you actually read the paper but go ahead with the sensationalist narratives Sam Biddle / @samfbiddle : Interesting how often when something Chinese outperforms something American (TikTok, deepseek, cars, etc) it's a “psyop” or “economic warfare” or “digital fentanyl” or a “cyberweapon” and not just “another country made something people like” @geiger_capital : Ok. Deepseek is just as good, if not better, than OpenAI and costs 3% of the price... It took them 2 months and less than $6 million to build, using reduced-capability chips, while US companies are pouring in hundreds of BILLIONS. So... what happens to the Nasdaq? LinkedIn: Brittain Ladd : IS OPENAI THE NEXT SEARS? — I find what's taking place in AI to be quite fascinating and concerning. … June Odongo : In the early 2000s, in the computer science department at UMass Lowell, I recall noticing in one particular year, when I started to take some graduate classes … Mark Tluszcz : When innovators get disrupted by copycats...DeepSeek AI significantly cheaper than existing LLMs like OpenAI - the race to zero value for LLMs has accelerated. … Guan Seng Khoo, PhD : “Liang's status as an outsider in the AI field was an unexpected source of strength. At High-Flyer, he built a fortune by using AI and algorithms to identify patterns that could affect stock prices. 
… Forums: r/technology : How China's new AI model DeepSeek is threatening U.S. dominance r/geopolitics : How China's new AI model DeepSeek is threatening U.S. dominance r/LeopardsAteMyFace : The people that wanted to replace us with AI got replaced by AI r/economy : How China's AI model DeepSeek is threatening U.S. dominance. (CNBC)

Financial Times 2025-01-26

Discussion

@icooper Ian Cooper on bluesky
The story goes that British software devs in gaming (and SFX artists) became world beaters, allowing the UK to bat outside its league, because they had to practice their craft operating with far fewer resources than their US counterparts. — China is pulling the same trick. — …
@pmarca Marc Andreessen on x
Deepseek R1 is one of the most amazing and impressive breakthroughs I've ever seen — and as open source, a profound gift to the world. 🤖🫡
@jjitsev @jjitsev on x
(Yet) another tale of Rise and Fall: DeepSeek R1 is claimed to match o1/o1-preview on olympiad level math & coding problems. Can it handle versions of AIW problems that reveal generalization & basic reasoning deficits in SOTA LLMs? ( https://arxiv.org/...) 🧵1/n [image]
@nealkhosla Neal Khosla on x
deepseek is a ccp state psyop + economic warfare to make american ai unprofitable they are faking the cost was low to justify setting price low and hoping everyone switches to it damage AI competitiveness in the us dont take the bait
@dorialexander Alexander Doria on x
So DeepSeek situation summarized: *They are not a small engineer team but one of the leading frontier lab (+100 researchers full time). *They are not a newcomer. Started in 2023 by retraining a llama, then slowly rising to the top. All documented in their 16 (!) papers.
@garymarcus Gary Marcus on x
Important, smart thread on DeepSeek R1 and generalization.
@kimmonismus @kimmonismus on x
Billionaire and Scale AI CEO Alexandr Wang: DeepSeek has about 50,000 NVIDIA H100s that they can't talk about because of the US export controls that are in place. [video]
@dorialexander Alexander Doria on x
Ah seeing multiple critics for stating that 2023 born DeepSeek is not a “newcomer”. I'm sorry if you are not aware of how fast theses things go, you're not really into LLM research.
@joshc0301 Josh on x
Apologists for capitalism say that capitalism “incentivizes innovation” when research is best done collaboratively for the goal of advancing humanity When the primary goal for research is profit, the default is to avoid sharing info to competitors, which slows innovation
@dorialexander Alexander Doria on x
The wave of R1 reproduction is just starting and impressive results already. This will be a massive boon for small models (even more so with some level of specialization).
@sivil_taram Qian Liu on x
🚀 After 5 days of DeepSeek-R1, we've replicated its pure reinforcement learning magic on math reasoning — no reward models, no supervised fine-tuning, from a base model — and the results are mind-blowing: 🧠 A 7B model + 8K MATH examples for verification + Reinforcement
@suspendedrobot @suspendedrobot on x
OpenAI stole from the whole internet to make itself richer, DeepSeek stole from them and give it back to the masses for free I think there is a certain british folktale about this
@philippilk Philip Pilkington on x
The responses to this have been nuts. There are apparently a ton of AI hype-beasts on Twitter who have literally no idea what to do now that a few underfunded Chinese guys blew up their collective grift. Their response is now that DeepSeek is some sort of CCCP psyop. 🥴😵‍💫
@petergyang Peter Yang on x
I find DeepSeek's thinking output more fascinating than its actual output
@itspaulai Paul Couvert on x
No need to pay $200 to use Operator You can create an agent that uses a web browser without writing a line of code. Combine DeepSeek R1 and Browser Use (free and open source) and you're good to go. (Links and prompt below) [video]
@nielsrogge Niels Rogge on x
“So there's this Chinese company called DeepSeek which basically does what OpenAI initially intended to do. They open-sourced a model trained with large-scale reinforcement learning, beating everyone else, and even releasing a paper detailing their process” [image]
@soumithchintala Soumith Chintala on x
i'm comically impressed that people are coping on deepseek by spewing bizarre conspiracy theories — despite deepseek open-sourcing and writing some of the most detail oriented papers ever. read. replicate. compete. don't be salty, just makes you look incompetent.
@jaycaspiankang Kang on x
This DeepSeek story is hilarious. Greatest troll job in years. They just tweeted out the secrets and now what's gonna happen to NVIDIA and open ai? I guess we can still use it to make funny college football memes.
@agnostoxxx Le Shrub on x
Everyone is accusing Deepseek of either being a fraud or a CCP Trojan Horse. Meanwhile, Sam Altman is openly vying to become your new AI Overlord and y'all cool with it 🥹 [image]
@gregjstoker Greg J Stoker on x
The meltdown on X by American techno-feudalists has to do with the Chinese AI model “Deepseek” being far more efficient and cheaper than anything coming out of Silicon Valley. Technical leadership v.s the American business-led model designed to absorb the maximum amount of [image]
@yishan @yishan on x
R1 isn't the true big disruption. It's just a herald. Here's what I predict will happen: Within 3-6 months, an American company will duplicate the same cost-savings in a similar model. DS literally told everyone how to do it. After that, Deepseek will come out with something so
@archiexzzz Archie Sengupta on x
DeepSeek Engineers [image]
@nathanbenaich Nathan Benaich on x
hot take: deepseek is a blessing for ai startups that can now rip out their costly americano models and have profitable unit economics their investors should be happy, happy as a hippo [image]
@suhail @suhail on x
Looks like DeepSeek just literally did it more efficiently. Game recognize game. [image]
@vrushankdes Vrushank Desai on x
i find it hilarious that Deepseek released pages of details about their parallelism/quantization/etc to stave off haters doubting their training efficiency and.....haters still can't believe how efficient their model is lmaoo [image]
@amasad Amjad Masad on x
So much cope about DeepSeek. Not only did they release a great model. they also released a breakthrough training method (R1 Zero) that's already reproducing. I doubt they lied about training costs, but even if they did they're still awesome for this great gift to the world.
@iamgingertrash @iamgingertrash on x
Sam spent more on this than Deepseek did to train the model that killed OpenAI [image]
@iamgingertrash @iamgingertrash on x
This guy is a product of nepotism and his dad is balls-deep in OpenAI stock Can you smell the cope? Here's his next tweet: “It's patriotic to use OpenAI and you're a communist if you use open source models like Deepseek”
@nxthompson @nxthompson on x
The anger over Deepseek training on ChatGPT's outputs, in violation of TOS, is justified. But it might perhaps also make the AI companies think about the training they did on copyrighted content without consent. https://techcrunch.com/...
@natfriedman Nat Friedman on x
The deepseek team is obviously really good. China is full of talented engineers. Every other take is cope. Sorry.
@hardmaru @hardmaru on x
DeepSeek is a side project 🔥 [image]
@hxiao Han Xiao on x
@abacaj deepseek's holding 幻方量化 is a quant company, many years already, super smart guys with top math background; happened to own a lot of GPUs for trading/mining purpose, and deepseek is their side project for squeezing those gpus
@michael_kove Michael Kove on x
DeepSeek stole the AI thunder: - with zero hype from CEO, - zero “omg guys it changez everythin” influencers - no swanky demos - no bloated promises - no hints at “AGI achieved internally” They did it by shipping an actual product. [image]
@signulll @signulll on x
kind of hilarious how deepseek is exactly what chatgpt was supposed to be—what openai originally promised—before sam pivoted hard toward profit. china going open source on ai is a wild plot twist—basically throws out every argument @vkhosla & others made. this is like watching
@saboo_shubham_ Shubham Saboo on x
DeepSeek R1 is 100% Opensource and 96.4% cheaper than OpenAI o1 while delivering similar performance. OpenAI o1: $60.00 per 1M output tokens DeepSeek R1: $2.19 per 1M output tokens People with $200 ChatGPT subscription, let that sink in. [video]
@cgarciae88 Cristian Garcia on x
“deepseek is just a hack” “they trained on o1” the cope is unreal
@philippilk Philip Pilkington on x
Maybe, hear me out here, AI was massively overhyped because NVIDIA is one of the last remaining viable American hardware companies and Deepseek is just exposing the whole sector as a giant bubble full of capital misallocation and overinvestment. 🤔
@bigblackjacobin Edward Ongweso Jr on x
whether or not deepseek is something, it is funny to see an industry that only exists cause it sucks at the state's teat 24/7 freak out. Silicon Valley only exists because we've committed to misallocation and overinvestment/overvaluation as an industrial policy for Some Reason
@finbarrtimbers Finbarr on x
The DeepSeek papers are remarkable in their level of detail. For instance, DeepSeekMath, which introduced GRPO, goes into reproducible detail on how they created their math corpus from Common Crawl. [image]
@mikepfrank Michael P. Frank on x
Since R1 came out, people are talking like the massive compute farms deployed by Western labs are a waste, BUT THEY'RE NOT — don't you see? This just means that once the best of DeepSeek's clever cocktail of new methods are adopted by GPU-rich orgs, they'll reach ASI even faster.
@davetroy Dave Troy on x
On the one hand, Deepseek is going to prove deeply disruptive to the Silicon Valley ecosystem; but on the other hand it's better we call bullshit on this hype cycle sooner than later, even if it took the CCP to do it. Sam Altman is Elizabeth Holmes 2.0.
@localghost Aaron Ng on x
Here's Deepseek r1 1.5B thinking through a problem — it's comparable to 4o and Claude 3.5 Sonnet in a number of domains like math. Except... it's a 1.5B model... and can run on virtually any hardware. Truly a huge efficiency leap. [video]
@signulll @signulll on x
i've been running deepseek locally (i have a highest end mac studio) for few days, & it's absolutely on par with o1 or sonnet. i've been using it nonstop for coding and other tasks, & what would've cost me a fortune through api's is now completely free. this feels like a total
@orikron @orikron on x
Deepseek had a budget of $5 million and it beat Open AI's model. Open AI has a budget of $5 billion. That's 1000x ROI improvement. The US cannot build a technological moat when it's competing against far superior brains.
@kantrowitz Alex Kantrowitz on x
Serious question, if DeepSeek is this good, what happens to the companies spending billions building models with slightly better performance? [image]
@jenzhuscott Jen Zhu on x
This is actually DeepSeek's culture - give credits to the team and the CEO is invisible. The exact opposite to some.
@epsilontheory Ben Hunt on x
LLMs are operating systems. Open source, locally running, high performance LLMs like DeepSeek R1 are the new Linux and will have the same impact on competitive AI landscape.
@slow_developer Haider on x
FACT it's funny how a chinese company (DeepSeek) forces the US company (openAI) to kneel down and offer their latest model, o3-mini, for free to users agree? [image]
@epsilontheory Ben Hunt on x
I think it's more likely that the US govt tries to ban DeepSeek R1 than TikTok. [image]
@tynervp @tynervp on x
BREAKING: Deepseek rumoured to be training r2 on a previously unknown chip powered purely on american cope.
@thestalwart Joe Weisenthal on x
So funny everyone suddenly realizing that DeepSeek is legit. I've been paying attention to them since Tuesday.
@dioscuri Henry Shevlin on x
Deepseek R1 is ridiculously good. Better than any LLM I've ever used so far, and notably extremely low rates of hallucinations, even on questions that are designed to elicit them.
@shaunrein Shaun Rein on x
In 1 week, from DeepSeek to Red Note, Chinese tech companies have shattered the self-confidence of Silicon Valley Techbros at OpenAI, Facebook, Google wetting their pants I predicted this would happen in my 2014 book The End of Copycat China but SV & media attacked me as a
@tsarnick @tsarnick on x
Perplexity CEO Aravind Srinivas says restrictions on chip imports to China are forcing them to innovate efficient solutions to AI model training, with DeepSeek trained on only 2048 H800 GPUs, making them the equivalent of DOGE for AI [video]
@arithmoquine Henry on x
i've made over 200,000 requests to the deepseek api in the last few hours. zero ratelimiting, and the whole thing cost me like 50 cents. bless the CCP, openai could never
@rnaudbertrand Arnaud Bertrand on x
That's actually a fantastic illustration why Western media's bias on China actually hurts the West more than it does China. Deepseek is only “startling” if you based your understanding on China off reporting by the likes of The Economist, who keep picturing a China as a
@pastynome @pastynome on x
DeepSeek is another example of China rapidly commoditizing new high tech industries and preventing the West from making excess profits or “rent” as it's called. If this keeps going, the standard of living in the West will fall.
@emostaque Emad on x
Deepseek have h800s which are h100s with reduced interconnect which is why they had to come up with lots of optimisations per their latest two papers No export controls on those so nothing to hide (@dylan522p the 50k is your guess?), my guess is they have 10-20k
@deedydas Deedy on x
DeepSeek built a high performance computer on Aug 31 with 10,000 A100 GPUs. A must-read paper only cited ONCE. In their V3 paper, the base for R1, they say they train on 2048 H800s, the export-controlled H100 with 50% the transfer rate. Why didn't they use the 10,000 A100s? [imag…
@thesiriusreport @thesiriusreport on x
Deepseek R1 has made many in the West finally question why everything we produce technologically costs ridiculous amounts of money to do so.
@christiancooper Christian H. Cooper on x
I asked #R1 to visually explain to me the Pythagorean theorem. This was done in one shot with no errors in less than 30 seconds. Wrap it up, its over: #DeepSeek #R1 [video]
@seanmccarthycom Sean Padraig McCarthy on x
Many reasons why China is winning and will win the tech race but a big one is university is affordable in China and it's now a debt slavery scheme here. They have access to their full human capital while in the US only upper class and above get the chance to create next DeepSeek
@matthewclifford Matt Clifford on x
Excellent take. While DeepSeek and the team behind R1 are super impressive, there's a huge amount of over correction going on...
@emollick Ethan Mollick on x
No matter how much you fight it, I find that the visible chain-of-thought from DeepSeek makes it nearly impossible to avoid anthropomorphizing the thing. The visible first-person “thinking” makes you feel like you are reading a diary of a somewhat tortured soul who wants to help …
@silverspookguy @silverspookguy on x
Thank you Chinese Deepseek devs for proving GenAI is a giant scam inflated by capitalists and is actually worth less than $5.5 million. [image]
@anammostarac Ana Mostarac on x
“Deepseek is a ccp state psyop + economic warfare to make american ai unprofitable” is an embarrassingly low agency, defeatist take.
@bigmeaninternet Malcolm Harris on x
Is one of the reasons DeepSeek took such a leap that they embraced technical leadership, vs. American industry which seems mostly business led and designed to absorb a maximum of investment capital?
@8teapi Prakash on x
Deepseek is not a “side project”. At the same time employees are not lying when they say it is. The story they are telling is myth making in the same vein in the Silicon Valley “we want to make the world a better place” but at the same time make billions of dollars. The team [vid…
@deanwball Dean W. Ball on x
The amount of factually incorrect information and hyperventilating takes on deepseek on this website is truly astounding. I assumed that an object-level analysis was unnecessary but apparently I was wrong. Here you go: 1. DeepSeek is an extremely talented team and has been
@mattbruenig Matt Bruenig on x
Maybe trying to keep China from importing technology and thereby forcing them to innovate it better themselves is making them stronger, as with deepseek. Funny situation
@philippilk Philip Pilkington on x
How many examples do we need of export restrictions and sanctions on China driving innovation before policymakers in DC get it through their skulls? Really, it's getting embarrassing at this stage. DeepSeek is only the latest leapfrog in part created by these restrictions. 🇨🇳🇺🇸 [i…
@_its_not_real_ @_its_not_real_ on x
Registering prediction: DeepSeek is over fitted to the most common 70% of use cases, trained on synthetic ChatGPT data, and the actual budget given is BS. There is nothing novel to it and once Meta et Al tear into it this will be apparent.
@katewillett Kate Willett on x
China releasing Deepseek a day after this is hilarious.
@jxmnop Jack Morris on x
i guess DeepSeek broke the proverbial four-minute-mile barrier. people used to think this was impossible. and suddenly, RL on language models just works and it reproduces on a small-enough scale that a PhD student can reimplement it in only a few days this year is going to
@silvermanjacob Jacob Silverman on x
DeepSeek just smoked OpenAI by being more innovative and far more efficient, not stealing tech
@jordihays Jordi Hays on x
What TikTok is to 22 year old girls, DeepSeek is to developers. Artificial views and likes are just replaced with artificially low inference costs.
@hesamation @hesamation on x
Respect! Hugging Face 🤗 is reproducing the whole DeepSeek R1 pipeline to be used by open source community. So far it has > GRPO implementation > train and evaluation code > generator for synthetic data [image]
@timhwang Tim Hwang on x
Rumors out today that Deepseek is being aided by a secretive global terrorist organization known only by its sinister alias: Meta
@thestalwart Joe Weisenthal on x
I wrote about how easily and quickly I was able to switch from using ChatGPT to using DeepSeek for my random day-to-day AI queries https://www.bloomberg.com/...
@rushdoshi Rush Doshi on x
Very useful context on DeepSeek. Did they really accomplish all that on just 5 million? Seems not. Probably more like $1 billion.
@andr3jh @andr3jh on x
OpenAI engineers browsing the “our team” section of the DeepSeek website and recognizing their ex-girlfriends [image]
@chiefchimpanzee Darshan Sanghrajka on x
It's pretty amazing that a bunch of quants at a hedge fund in China made Deepseek for $6m and did it as a side project 🤣 Every Western AI bro that has burned BILLIONS just got sideswiped by them. No wonder Sam Altman is out there with red eyes chatting utter nonsense.
@menhguin Minh Nhat Nguyen on x
R1 model aside, did anyone notice that Deepseek app has multimodality, PDF upload and search, which not even O1 pro has rn? [image]
@emostaque Emad on x
This is the crazy thing about conspiracy theories about @deepseek_ai, they open source their models and have fantastic detailed papers! Everyone who paid attention has known how great their work has been and aren't hugely surprised by R1 🐳
@citrini7 @citrini7 on x
deepseek being the brainchild of a chinese hedge fund is such a complete total cultural victory by capitalism that, ideologically at least, it should soften the blow.
@philippilk Philip Pilkington on x
This is what makes the DeepSeek thing so funny. A bunch of grifters have been selling AI secret sauce for years - spooky mystery juice that could never be fully explained. Now a bunch of young guys just wrote a good algo, published it, and the circus tent burned down. 🤖
@qcapital2020 @qcapital2020 on x
So wait wait wait , the founder of DeepSeek is basically the Jim Simons of China and was doing this LLM thing only as a side project and for $6M was able to dethrone every AI company in the world. We are so cooked LOL [image]
@adamemedia Adam on x
China has created one of the world's best AI models for only $6 million, as opposed to the billions spent by Facebook, Google, Microsoft etc. And DeepSeek is open-sourced, while the US models are proprietary and secretive—exposing the West's bloated, profit-driven approach to [im…
@rnaudbertrand Arnaud Bertrand on x
All these posts about Deepseek “censorship” just completely miss the point: Deepseek is Open Source under MIT license which means anyone is allowed to download the model and fine-tune it however they want. Which means that if you wanted to use it to make a model whose purpose is …
@nobleqali Q. Anthony Ali on x
The more I read these responses to China dropping DeepSeek into the public domain, the more I realize this was a deliberate attack on the tech oligopolists. They're freaking out because they've been exposed as overpaid hacks
@hamptonism @hamptonism on x
> be an electric engineering student > team up w/ cracked classmates > start quant trading *we're so cracked* > founded a quant firm in his 30's > makes ¥100B trading with ai/ml *we're even more cracked with ai* > buys thousands of Nvdia GPUs > creates DeepSeek as a side project …
@basedbeffjezos @basedbeffjezos on x
The last thing your AI lab sees before getting one-shotted by a Chinese bootleg LLM team with a $5M cluster [image]
@rnaudbertrand Arnaud Bertrand on x
All benchmarks now confirm it: Deepseek truly is as good as OpenAI's o1 (which is top of the range) for 3% of the price. Boom. And that's when you want to pay for the API. You can also use it Open Source for “free” (which you can't do with o1). There's no overstating how [imag…
@emostaque Emad on x
The furore over DeepSeek R1 costing $5.5m buried the actual lede.. It didn't. That's how much the base v3 model cost (GPT 4o level). R1 likely cost O($100k), but the real buried lede is the distilled versions of Qwen & Llama which cost O($10k) to tune & can still improve..
@anjneymidha Anjney Midha on x
From Stanford to MIT, deepseek r1 has become the model of choice for America's top university researchers basically overnight
@kevinroose Kevin Roose on x
It's sort of funny that every American tech company is bragging about how much money they're spending to build their models, and DeepSeek is just like “yeah we got there with $47 and a refurbished Chromebook”
@hamandcheese Samuel Hammond on x
This is wrong on several levels. - DeepSeek trains on h100s. Their success reveals the need to invest in export control *enforcement* capacity. - CoT / inference-time techniques make access to large amounts of compute *more* relevant, not less, given the trillions of tokens
@emostaque Emad on x
Deepseek are not faking the cost of the run. It's pretty much in line with what you'd expect given the data, structure, active parameters and other elements and other models trained by other people You can run it independently at the same cost It's a good lab working hard 😓
@schuldensuehner Holger Zschaepitz on x
China's #DeepSeek could represent the biggest threat to US equity markets as the company seems to have built a groundbreaking AI model at an extremely low price and w/o having access to cutting-edge chips, calling into question the utility of the hundreds of billions worth of [im…
@nabeelqu Nabeel S. Qureshi on x
Everyone is way overindexing on the $5.5m final training run number from DeepSeek. - GPU capex probably $1BN+ - Running costs are probably $X00M+/year - ~150 top-tier authors on the v3 technical paper, $50m+/year They're not some ragtag outfit, this was a huge operation.
@avichal @avichal on x
What's more likely? 1 - small group of AI engineers at @deepseek_ai figures out how to beat all of the top researchers in the world as a side project 2 - Chinese government has 100k GPUs they shouldn't have and releases open source models claiming $6m training cost as a psyop
@yacinemtb Kache on x
Closed Source AI is DEAD Do you know how many lawyers you need to employ to even justify using openai's API for some internal use case? It would take your lawyers 6 months of lead time to sort it all out. And by then, deepseek and qwen will have released 2 new models
@firstadopter Tae Kim on x
The probability it cost DeepSeek $6 million of spending (R&D) to create their models is ZERO if you actually read the paper but go ahead with the sensationalist narratives
@samfbiddle Sam Biddle on x
Interesting how often when something Chinese outperforms something American (TikTok, deepseek, cars, etc) it's a “psyop” or “economic warfare” or “digital fentanyl” or a “cyberweapon” and not just “another country made something people like”
@geiger_capital @geiger_capital on x
Ok. Deepseek is just as good, if not better, than OpenAI and costs 3% of the price... It took them 2 months and less than $6 million to build, using reduced-capability chips, while US companies are pouring in hundreds of BILLIONS. So... what happens to the Nasdaq?
r/technology r on reddit
How China's new AI model DeepSeek is threatening U.S. dominance
r/geopolitics r on reddit
How China's new AI model DeepSeek is threatening U.S. dominance
r/LeopardsAteMyFace r on reddit
The people that wanted to replace us with AI got replaced by AI
r/economy r on reddit
How China's AI model DeepSeek is threatening U.S. dominance. (CNBC)
