Chronicles

The story behind the story


DeepSeek releases its new flagship models V4 Pro and V4 Flash in preview, saying V4 Pro trails the performance of state-of-the-art models by about 3 to 6 months

DeepSeek rolled out preview versions of a new flagship artificial intelligence model a year after upending Silicon Valley …

Bloomberg

Discussion

  • @deepseek_ai @deepseek_ai on x
    🚀 DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length. 🔹 DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world's top closed-source models. 🔹 DeepSeek-V4-Flash: 284B total / 13B active params. [ima…
  • @suchenzang Susan Zhang on x
    so that explains the delay... deepseek could not fix training instabilities after doubling from ~15T tokens in v3 to ~33T tokens in v4. The 10+ mentions of “stability” tricks seem to be wildly lacking if these two were the main bandages (mismatched routing + clamping) but [image]
  • @eliebakouch Elie on x
    Deepseek V4 Pro is the biggest open model ever with 1.6T total 49B active, trained on 33T tokens, 1M context, with 2 new attention mechanisms, Muon, mHC, open source kernels, FP4 QAT, MIT license and with one of the best tech reports of the year [image]
  • @arena @arena on x
    Exciting news - DeepSeek V4 Pro is in the Arena with 1.6T parameters (49B activated) alongside V4 Flash at 284B parameters (13B activated). Both support 1M token context. It's a major leap over DeepSeek V3.2! Code Arena: - DeepSeek V4 Pro (thinking): #3 open model (#14 overall), …
  • @emollick Ethan Mollick on x
    My first two TiKZ Sparks unicorns from DeepSeek v4. (Expert mode, from the DeepSeek site, which is supposed to be v4 Pro according to the release) [image]
  • @iamgingertrash @iamgingertrash on x
    A Chinese company facing chip restrictions can train this. But xAI can't even get SOTA with a million H100 equivalents
  • @mweinbach Max Weinbach on x
    Here are the deepseek v4 benchmarks vs. US frontier models [image]
  • @basedjensen @basedjensen on x
    I can't state enough how big of a deal this is and they have made everything open source. Whale bros might have saved local inference [image]
  • @emollick Ethan Mollick on x
    Kimi K2.6, for comparison. Honestly don't know why the gap is so large. [image]
  • @amasad Amjad Masad on x
    While US politicians/lobbyists are scaremongering about “Chinese distillation,” Chinese scientists are actually sharing real AI breakthroughs in the open. These kind of advances have nothing to do with data and benefit everyone, including small (and possibly big) US labs.
  • @dee_bosa Deirdre Bosa on x
    What does frontier pricing power look like when there's a permanent, well-funded, open-source shadow market 3-6 months behind?
  • @yuchenj_uw Yuchen Jin on x
    I'm still amazed that DeepSeek, Kimi, and Qwen can train very strong LLMs with far fewer and often nerfed NVIDIA GPUs, or even Huawei chips. DeepSeek V4 report shows they invent new attention architectures to make training/inference more efficient. Creativity loves constraints.
  • @mtslive @mtslive on x
    DeepSeek promising price reductions for V4-Pro later in the year as Huawei's next generation Atlas 950 SuperPoD multi-rack clusters come online.
  • @simonw Simon Willison on x
    These pelicans are kind of angry looking! Left is deepseek-v4-flash, right is deepseek-v4-pro - both generated using OpenRouter via my LLM tool [image]
  • @jukan05 Jukan on x
    What kind of magic did DeepSeek pull off this time? With V4, they seem to be back at SOTA again. Their coding performance also looks pretty serious. [image]
  • @haider1 Haider on x
    finally, deepseek v4 is here and the crazy part is that it is now easily comparable to opus 4.7 and gpt-5.5 level models, especially in coding, reasoning, and long-context work deepseek v4-pro has a 1m context window and costs around $1.74/m input and $3.48/m output how they [ima…
  • @victor207755822 Deli Chen on x
    DeepSeek-V3: Dec 26, 2024 DeepSeek-V4: Apr 24, 2026 484 days later, we humbly share our labor of love. As always, we stay true to long-termism and open source for all. AGI belongs to everyone. ❤️🌍 #DeepSeekV4 #AGIforEveryone #OpenSource
  • @valsai @valsai on x
    The 🐳 has surfaced and it's a powerhouse on the Vals leaderboards, dominating on coding. DeepSeek V4 just landed #2 on the Vals Index, nearly tying Kimi K2.6 (only 0.07% behind). [image]
  • @chrisrmcguire Chris McGuire on x
    DeepSeek v4 just dropped. At first glance it does not appear to be the kind of leap that v3 claimed to be in January 2025, nor does it challenge the consensus regarding the state of the U.S.-China AI competition: U.S. models lead by ~7 months, and leading Chinese models remain
  • @_arohan_ Rohan Anil on x
    DeepSeek Pro is a 1e25 FLOP run. Flash is a 2.5e24 FLOP run, in 6ND.
  • @scobleizer Robert Scoble on x
    DeepSeek 4 is out. My AI says: +++++ The Numbers That Matter V4 Pro costs $3.48 per million output tokens. Claude Opus 4.6 costs $25. GPT-5.4 costs $15. Same benchmark tier. One fifth the price. ValsAI ran independent tests. V4 is now number 1 on their Vibe Code Benchmark.
  • @arena @arena on x
    DeepSeek v4 lands in the Text Arena: - DeepSeek V4 Pro (thinking): #2 open model (#14 overall), matching Kimi-2.6 - DeepSeek V4 Flash (thinking): #10 open model (#47 overall) Top 10 Text categories: - #1 Medicine & Healthcare (v4 Pro) - #8 Legal & Government (v4 Pro) - #8 Math [i…
  • @jukan05 Jukan on x
    Very interesting. DeepSeek added the following comment with V4: “Due to constraints in high-end compute capacity, the current service capacity for Pro is very limited. After the 950 supernodes are launched at scale in the second half of this year, the price of Pro is expected [im…
  • @mweinbach Max Weinbach on x
    Yea deepseek v4 flash/pro don't really perform that well compared to any of the major US models, even 1-2 revisions old. Looks like it's slightly behind Opus 4.5 in practice, and on par or slightly behind Kimi K2.6 Some good optimization techniques there, but overall, eh
  • @zephyr_z9 @zephyr_z9 on x
    Massive compute gap. At least 40x-80x
  • r/DeepSeek r on reddit
    Deepseek-v4 flash and v4 pro
  • r/DeepSeek r on reddit
    Deepseek v4 Released!
  • r/singularity r on reddit
    DeepSeek V4 has released
  • r/LocalLLaMA r on reddit
    Buried lede: Deepseek v4 Flash is incredibly inexpensive from the official API for its weight category
  • r/LocalLLaMA r on reddit
    Deepseek V4 Flash and Non-Flash Out on HuggingFace
  • @simonw Simon Willison on x
    More of my notes on DeepSeek V4 - the really big news is the pricing: both DeepSeek-V4-Flash and DeepSeek-V4-Pro are the cheapest models in their categories while benchmarking close to the frontier models from other providers https://simonwillison.net/... [image]
  • @simonwillison.net Simon Willison on bluesky
    DeepSeek V4 just dropped - two models, Flash and Pro, both benchmarking well, decent pelicans and prices that put them both as the cheapest in their respective categories by a solid margin simonwillison.net/2026/Apr/24/ ...  [images]
  • @sino_market @sino_market on x
    HUAWEI ASCEND SUPERNODE TO SUPPORT DEEPSEEK V4: HUAWEI STATEMENT #CHINA #TECH #AI #HUAWEI #DEEPSEEK (https://mktnews.com/...)
  • @macrobombastic @macrobombastic on x
    @jukan05 Bro, that's a pretty transparent statement from DeepSeek. Good on 'em for being upfront about their limitations.
  • @teortaxestex @teortaxestex on x
    DeepSeek is trolling They don't say *what* they trained on, only “eh MXFP4” Hawks will now go on a hunt for Blackwells in Inner Mongolia In a year they'll publish “insights into V4” with details of an experimental Huawei cluster man I love this lab so much [image]
  • @teortaxestex @teortaxestex on x
    > The library supports SM90 (Hopper) and SM100 (Blackwell), with no Huawei Ascend support I wanted to say “TileLang doesn't even support Ascend” but apparently there is a variant still, I repeat that all this “DeepSeek spent 9000 years adapting for Huawei” is most likely bullshit…
  • @poezhao0605 Poe Zhao on x
    Buried in the fine print: DeepSeek says V4-Pro throughput is currently limited by high-end compute supply. Prices will drop significantly once Huawei Ascend 950 super nodes ship at scale in H2. DeepSeek is publicly tying its API economics to domestic chip infrastructure. That's […
  • @zichenwanghere Zichen Wang on x
    DeepSeek 4, on Huawei's Ascend chips, is apparently being released in a live event via Bilibili, China's equivalent to YouTube, today at 7pm Beijing Time on April 24. [image]
  • @sgg_trader @sgg_trader on x
    @jukan05 Chinese from China said they are “forced” to use Huawei.
  • @sheriyuo Xiuyu Li on x
    This is the inevitable outcome of locally deploying domestically produced (localized) cards, and it will definitely bring the prices down.
  • @himself65 @himself65 on x
    @jukan05 They have to use hw, for political propaganda.
  • @ErikJonker@mastodon.social Erik Jonker on mastodon
    Do not only look at benchmarks of AI models.  Costs are also very important and the differences are big.  In the end that is very important for businesses using AI at scale.  Picture is from this excellent blog/post from @simon  —  https://simonwillison.net/...  #AI #deepseekv4 #…
  • Houwei Rui Houwei Rui on linkedin
    DeepSeek V4 marks a clear shift in AI:  —  performance is catching up, but cost is dropping faster. …
  • Daniel Langkilde Daniel Langkilde on linkedin
    DeepSeek v4 is out ❤️ I'm trying to slow-read papers right now.  It's tempting to just push them to Claude and ask “what's new”, but that's a slippery slope towards AI-brain-rot. …
  • Merve Noyan Merve Noyan on linkedin
    if you were in a cave, DeepSeek v4 is out, and it's groundbreaking, here's a short post on why:  —  it's the first open model to have solved long context …
  • Angus CHENG Zhe Angus CHENG Zhe on linkedin
    The long-awaited, cost-efficient DeepSeek V4 unveiled its preview model today, with full support from Huawei's Ascend supernode. …
  • Jing Conan Wang Jing Conan Wang on linkedin
    DeepSeek v4 might be a turning point for the entire agentic AI stack.  —  Not just because it's a strong model — but because it shifts two important dynamics: …
  • @deanwball Dean W. Ball on x
    This is actually the most salient fact about V4, rather than “da bar charts look gud!!!” simpleton logic. It's not actually that competitive of a model, but it is useful substrate for all manner of upstarts, worldwide. This is pro-social, and I wish the US had an equivalent.
  • @deanwball Dean W. Ball on x
    V4's benchmarks don't tell you much, but the revealing thing is DeepSeek can barely serve the model. The fact that a model came out and “da charts looked good!!!” is not a sign of much; thinking otherwise is the logic of apes or, in Arnaud's case, obvious assets.
  • @synthwavedd Leo on x
    my first impressions on deepseek v4: - little disappointing it's not SoTA after all this time, but it's close - new pareto frontier - a lot cheaper than 5.4/opus 4.6 for comparable performance! - new favourite model for creative writing - pretty noticeable big model smell -
  • @kyleichan Kyle Chan on x
    DeepSeek V4 is impressive because it's a near-SOTA model with highly efficient 1 million token context that can run on Huawei's new Ascend 950PR chips. But equally notable is what V4 didn't do: - No mention of training on Chinese AI chips - Still lags behind US frontier models
  • @dorialexander @dorialexander on x
    Didn't expect to be hit with the most significant AI paper of the year on breakfast. Longer read coming but what's obvious already: turning a sovereignty play (using Huawei ascend) into an opportunity to reshape hardware. Intercoms, memory, power: wishlist everywhere. [image]
  • @antirez @antirez on x
    First impressions on DeepSeek v4 pro used via Claude Code. It is great but not so cheap compared to how many tokens you get with the OpenAI 200$ subscription. More or less I burned 1$ per hour of intense usage.
  • @deanwball Dean W. Ball on x
    This prediction from a Chinese AI researcher—that the gap between US and Chinese AI is growing larger—matches my own, from well over a year ago now, that DeepSeek r1 marked ~the closest Chinese models would get to the US frontier absent a change in China's access to compute.
  • @teortaxestex @teortaxestex on x
    anon do you realize that V4-Pro is straight up the strongest pretrained model we have? Like... 1.6T@49AB (≈280B dense), 33T - even by meme formula it's > LLaMA 3. Add Muon, mHC, most steps 64K context + extended to 1M... No excuses now. Every “unicorn” can have its brand AGI. [im…
  • @artificialanlys @artificialanlys on x
    DeepSeek V4 Pro is the #1 open weights model on GDPval-AA, our agentic real-world work tasks evaluation @deepseek_ai has released V4 Pro (1.6T total / 49B active) and V4 Flash (284B total / 13B active). V4 is DeepSeek's first new size since V3, with all intermediate models [image…
  • @sn@mastodon.ping.de @sn@mastodon.ping.de on mastodon
    #deepseek V4 Flash should run ok on 2x Strix Halo being around 158GB with 13b active parameters.  Looking forward to see their innovations in action, in particular the Hybrid attention architecture for FLOPs and KV cache savings.  Plus it's only a preview with multi-modality comi…
  • r/LocalLLaMA r on reddit
    No Multimodality yet in DeepSeek-V4.  But I'll wait.
  • @thdxr Dax on x
    deepseek v4 can run on huawei chips another step towards self reliance thanks to our own policies can't believe how many times we'll make this mistake
  • Denys Linkov Denys Linkov on linkedin
    DeepseekV4 is out, retaining its MIT licence!  We got GPT 5.5 and Deepseek v4 on the same day, some initial thoughts from the technical report: …
  • @timkellogg.me Tim Kellogg on bluesky
    Today is DeepSeek V4 Day  —  Truly hangs with Opus 4.6 & GPT-5.4  — Pro: 1.6T (49B active)  — Flash: 284B / 13B  — 1M context using DSA  — Optimized for Claude Code  — Pro price: $0.14 in / $3.48 out  — Flash price: $0.028 in / $0.28 out  —  Open source has caught the frontier. …
  • @mukund M Mohan on x
    OMG DeepSeek is turning model efficiency into a weapon. It's 30-44% cheaper. DeepSeek AI is making LLMs look like an economic design problem: sparse attention, compressed KV cache, fewer activated parameters, FP4/FP8 weights, and reasoning modes that let users trade speed,
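
The compute and memory figures scattered through these posts can be cross-checked with two standard rules of thumb. This is a quick sanity-check sketch, not anything from DeepSeek's report: the helper names are ours, and the inputs are the numbers quoted above (33T tokens, 49B/13B active params, 284B total params for Flash).

```python
# Quick arithmetic sanity checks on figures quoted in the discussion above.
# The formulas are standard back-of-the-envelope estimates, not anything
# DeepSeek published.

def six_nd_flops(active_params: float, tokens: float) -> float:
    """Training-compute estimate: FLOPs ~ 6 x N (active params) x D (tokens)."""
    return 6 * active_params * tokens

def weight_bytes(total_params: float, bits_per_param: float) -> float:
    """Approximate size of the weights alone (no KV cache or activations)."""
    return total_params * bits_per_param / 8

TOKENS = 33e12  # ~33T training tokens, per the thread

# "Pro is a 1e25 flop run. Flash is a 2.5e24 ... in 6ND" -- both check out:
print(f"V4-Pro   (49B active): ~{six_nd_flops(49e9, TOKENS):.1e} FLOPs")  # ~9.7e24
print(f"V4-Flash (13B active): ~{six_nd_flops(13e9, TOKENS):.1e} FLOPs")  # ~2.6e24

# "V4 Flash should run ok ... being around 158GB": pure FP4 on 284B total
# params gives ~142 GB, so the quoted ~158 GB implies some tensors are
# kept above 4 bits.
print(f"V4-Flash weights @ FP4: ~{weight_bytes(284e9, 4) / 1e9:.0f} GB")  # ~142 GB
```

The 6ND estimates land within rounding distance of both quoted run sizes, which suggests those figures were derived the same way rather than disclosed directly.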