DeepSeek releases its new flagship models V4 Pro and V4 Flash in preview, saying V4 Pro trails the performance of state-of-the-art models by about 3 to 6 months

Bloomberg 2026-04-25

Discussion

@deepseek_ai @deepseek_ai on x
🚀 DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length. 🔹 DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world's top closed-source models. 🔹 DeepSeek-V4-Flash: 284B total / 13B active params. [ima…
@jukan05 Jukan on x
Thoughts after reading the DeepSeek V4 paper: - NVIDIA really is something else. Remember how back in 2024 people were bashing Blackwell as overspec'd and dismissing FP4 as just marketing? Turns out it was all groundwork for the next generation of models. Maybe NVIDIA's moat is
@nrehiew_ @nrehiew_ on x
Worth thinking about the compute gap. Pretraining compute for DeepSeek v4 is ~1e25 flops. OpenAI has 100K GB200s. Assuming all are used, and with a mere 15% MFU, the pretraining run would complete in just over a day (37 hours)
@mukund M Mohan on x
OMG DeepSeek is turning model efficiency into a weapon. Its 30 - 44% cheaper DeepSeek AI is making LLM it look like an economic design problem: sparse attention, compressed KV cache, fewer activated parameters, FP4/FP8 weights, and reasoning modes that let users trade speed,
@teortaxestex @teortaxestex on x
anon do you realize that V4-Pro is straight up the strongest pretrained model we have? Like... 1.6T@49AB (≈280B dense), 33T - even by meme formula it's > LLaMA 3. Add Muon, mHC, most steps 64K context + extended to 1M... No excuses now. Every “unicorn” can have its brand AGI. [im…
@dorialexander @dorialexander on x
Didn't expect to be hit with the most significant AI paper of the year on breakfast. Longer read coming but what's obvious already: turning a sovereignty play (using Huawei ascend) into an opportunity to reshape hardware. Intercoms, memory, power: wishlist everywhere. [image]
@deanwball Dean W. Ball on x
This is actually the most salient fact about V4, rather than “da bar charts look gud!!!” simpleton logic. It's not actually that competitive of a model, but it is useful substrate for all manner of upstarts, worldwide. This is pro-social, and I wish the US had an equivalent.
@deanwball Dean W. Ball on x
V4's benchmarks don't tell you much, but the revealing thing is DeepSeek can barely serve the model. The fact that a model came out and “da charts looked good!!!” is not a sign of much; thinking otherwise is the logic of apes or, in Arnaud's case, obvious assets.
@synthwavedd Leo on x
my first impressions on deepseek v4: - little disappointing it's not SoTA after all this time, but it's close - new pareto frontier - a lot cheaper than 5.4/opus 4.6 for comparable performance! - new favourite model for creative writing - pretty noticeable big model smell -
@kyleichan Kyle Chan on x
DeepSeek V4 is impressive because it's a near-SOTA model with highly efficient 1 million token context that can run on Huawei's new Ascend 950PR chips. But equally notable is what V4 didn't do: - No mention of training on Chinese AI chips - Still lags behind US frontier models
@antirez @antirez on x
First impressions on DeepSeek v4 pro used via Claude Code. It is great but not so cheap compared to how much tokens you get with the OpenAI 200$ subscription. More or less I burned 1$ per hour of intense usage.
@victor207755822 Deli Chen on x
DeepSeek-V3: Dec 26, 2024 DeepSeek-V4: Apr 24, 2026 484 days later, we humbly share our labor of love. As always, we stay true to long-termism and open source for all. AGI belongs to everyone. ❤️🌍 #DeepSeekV4 #AGIforEveryone #OpenSource
@arena @arena on x
Exciting news - DeepSeek V4 Pro is in the Arena with 1.6T parameters (49B activated) alongside V4 Flash at 284B parameters (13B activated). Both support 1M token context. It's a major leap over DeepSeek V3.2! Code Arena: - DeepSeek V4 Pro (thinking): #3 open model (#14 overall), …
@chrisrmcguire Chris McGuire on x
DeepSeek v4 just dropped. At first glance it does not appear to be the kind of leap that v3 claimed to be in January 2025, nor does it challenge the consensus regarding the state of the U.S.-China AI competition: U.S. models lead by ~7 months, and leading Chinese models remain
@yuchenj_uw Yuchen Jin on x
I'm still amazed that DeepSeek, Kimi, and Qwen can train very strong LLMs with far fewer and often nerfed NVIDIA GPUs, or even Huawei chips. DeepSeek V4 report shows they invent new attention architectures to make training/inference more efficient. Creativity loves constraints.
@mweinbach Max Weinbach on x
Yea deepseek v4 flash/pro don't really perform that well compared to any of the major US models, even 1-2 revisions old. Looks like it's slightly behind Opus 4.5 in practice, and on par or slightly behind Kimi K2.6 Some good optimization techniques there, but overall, eh
@suchenzang Susan Zhang on x
so that explains the delay... deepseek could not fix training instabilities, after doubling from ~15T tokens in v3 to ~33T tokens in v4 the 10+ mentions of “stability” tricks seem to be wildly lacking if these two were the main bandages (mismatched routing + clamping) but kudos f…
@artificialanlys @artificialanlys on x
DeepSeek V4 Pro is the #1 open weights model on GDPval-AA, our agentic real-world work tasks evaluation @deepseek_ai has released V4 Pro (1.6T total / 49B active) and V4 Flash (284B total / 13B active). V4 is DeepSeek's first new size since V3, with all intermediate models [image…
@amasad Amjad Masad on x
While US politicians/lobbyists are scaremongering about “Chinese distillation,” Chinese scientists are actually sharing real AI breakthroughs in the open. These kind of advances have nothing to do with data and benefit everyone, including small (and possibly big) US labs.
@deanwball Dean W. Ball on x
This prediction from a Chinese AI researcher—that the gap between US and Chinese AI is growing larger—matches my own, from well over a year ago now, that DeepSeek r1 marked ~the closest Chinese models would get to the US frontier absent a change in China's access to compute.
@dee_bosa Deirdre Bosa on x
What does frontier pricing power look like when theres a permanent, well-funded, open-source shadow market 3-6 months behind?
@scobleizer Robert Scoble on x
DeepSeek 4 is out. My AI says: +++++ The Numbers That Matter V4 Pro costs $3.48 per million output tokens. Claude Opus 4.6 costs $25. GPT-5.4 costs $15. Same benchmark tier. One fifth the price. ValsAI ran independent tests. V4 is now number 1 on their Vibe Code Benchmark.
@eliebakouch Elie on x
Deepseek V4 Pro is the biggest open model ever with 1.6T total 49B active, trained on 33T tokens, 1M context, with 2 new attention mechanisms, Muon, mHC, open source kernels, FP4 QAT, MIT license and with one of the best tech repot of the year [image]
@mtslive @mtslive on x
DeepSeek promising price reductions for V4-Pro later in the year as Huawei's next generation Atlas 950 SuperPoD multi-rack clusters come online.
@haider1 Haider on x
finally, deepseek v4 is here and the crazy part is that it is now easily comparable to opus 4.7 and gpt-5.5 level models, especially in coding, reasoning, and long-context work deepseek v4-pro has a 1m context window and costs around $1.74/m input and $3.48/m output how they [ima…
@simonw Simon Willison on x
These pelicans are kind of angry looking! Left is deepseek-v4-flash, right is deepseek-v4-pro - both generated using OpenRouter via my LLM tool [image]
@emollick Ethan Mollick on x
Kimi K2.6, for comparison. Honestly don't know why the gap is so large. [image]
@basedjensen @basedjensen on x
I can't state enough how big of a deal this is and they have made everything open source. Whale bros might have saved local inference [image]
@zephyr_z9 @zephyr_z9 on x
Massive compute gap At least 40x-80x
@_arohan_ Rohan Anil on x
DeepSeek Pro is a 1e25 flop run. Flash is a 2.5e24 flops in 6ND.
@emollick Ethan Mollick on x
My first two TiKZ Sparks unicorns from DeepSeek v4. (Expert mode, from the DeepSeek site, which is supposed to be v4 Pro according to the release) [image]
@iamgingertrash @iamgingertrash on x
A Chinese company facing chip restrictions can train this But xAI can't even get SOTA With a million H100 equivalents
@arena @arena on x
DeepSeek v4 lands in the Text Arena: - DeepSeek V4 Pro (thinking): #2 open model (#14 overall), matching Kimi-2.6 - DeepSeek V4 Flash (thinking): #10 open model (#47 overall) Top 10 Text categories: - #1 Medicine & Healthcare (v4 Pro) - #8 Legal & Government (v4 Pro) - #8 Math [i…
@jukan05 Jukan on x
Very interesting. DeepSeek added the following comment with V4: “Due to constraints in high-end compute capacity, the current service capacity for Pro is very limited. After the 950 supernodes are launched at scale in the second half of this year, the price of Pro is expected t…
@valsai @valsai on x
The 🐳 has surfaced and it's a powerhouse on the Vals leaderboards, dominating on coding. DeepSeek V4 just landed #2 on the Vals Index, nearly tying Kimi K2.6 (only 0.07% behind). [image]
@jukan05 Jukan on x
What kind of magic did DeepSeek pull off this time? With V4, they seem to be back at SOTA again. Their coding performance also looks pretty serious. [image]
@mweinbach Max Weinbach on x
Here are the deepseek v4 benchmarks vs. US frontier models [image]
M Mohan M Mohan on linkedin
DeepSeek is turning model efficiency into a weapon. Its 30 - 44% cheaper — DeepSeek AI is making LLM it look like an economic design problem …
Denys Linkov Denys Linkov on linkedin
DeepseekV4 is out, retaining it's MIT licence! We got GPT 5.5 and Deepseek v4 on the same day, some initial thoughts from the technical report: …
Merve Noyan Merve Noyan on linkedin
if you were in a cave, DeepSeek v4 is out, and it's groundbreaking, here's a short post on why: — it's the first open model to have solved long context …
Angus CHENG Zhe Angus CHENG Zhe on linkedin
The long-awaited, cost-efficient DeepSeek V4 unveiled its preview model today, with fully support by Huawei's Ascend supernode. …
Jing Conan Wang Jing Conan Wang on linkedin
DeepSeek v4 might be a turning point for the entire agentic AI stack. — Not just because it's a strong model — but because it shifts two important dynamics: …
Houwei Rui Houwei Rui on linkedin
DeepSeek V4 marks a clear shift in AI: — performance is catching up, but cost is dropping faster. …
Daniel Langkilde Daniel Langkilde on linkedin
DeepSeek v4 is out ❤️ I'm trying to slow-read papers right now. It's tempting to just push them to Claude and ask “what's new”, but that's a slippery slope towards AI-brain-rot. …
@timkellogg.me Tim Kellogg on bluesky
Today is DeepSeek V4 Day — Truly hangs with Opus 4.6 & GPT-5.4 — Pro: 1.4T (49B active) — Flash: 248B / 13B — 1M context using DSA — Optimized for Claude Code — Pro price: $0.14 in / $3.48 out — Flash price: $0.028 in / $0.28 out — Open source has caught the frontier. …
@sn@mastodon.ping.de @sn@mastodon.ping.de on mastodon
#deepseek V4 Flash should run ok on 2x Strix Halo being around 158GB with 13b active parameters. Looking forward to see their innovations in action, in particular the Hybrid attention architecture for FLOPs and KV cache savings. Plus it's only a preview with multi-modality comi…
r/DeepSeek r on reddit
Deepseek-v4 flash and v4 pro
r/DeepSeek r on reddit
Deepseek v4 Released!
r/LocalLLaMA r on reddit
Buried lede: Deepseek v4 Flash is incredibly inexpensive from the official API for its weight category
r/LocalLLaMA r on reddit
Deepseek V4 Flash and Non-Flash Out on HuggingFace
r/singularity r on reddit
DeepSeek V4 has released

Chronicles

DeepSeek releases its new flagship models V4 Pro and V4 Flash in preview, saying V4 Pro trails the performance of state-of-the-art models by about 3 to 6 months

Related Coverage

Discussion