An evaluation by NIST's CAISI says DeepSeek V4 Pro lags behind leading US AI models by about eight months and is the most capable Chinese AI model to date
In April 2026, the Center for AI Standards and Innovation (CAISI) evaluated the open-weight AI model DeepSeek V4 Pro ("DeepSeek V4").
NIST
Related Coverage
- China is falling behind in the AI race, according to a US government benchmark (The Decoder · Matthias Bastian)
Discussion
- Nikita Ostrovsky (@nikostro) on X: DeepSeek V4 has a similar capability to GPT-5, released 8 months ago, according to a new @NIST report. If the current trend continues, we'll see a Chinese model at GPT-5.5 (roughly Mythos-level) around February 2027. [image]
- Bill Bishop (@niubi) on X: In April 2026, the Center for AI Standards and Innovation (CAISI) evaluated the open-weight AI model DeepSeek V4 Pro ("DeepSeek V4"). CAISI evaluations indicate that DeepSeek V4's capabilities lag behind the frontier by about 8 months https://www.nist.gov/...
- Samuel Hammond (@hamandcheese) on X: New composite eval of DeepSeek V4 from CAISI suggests China is falling behind. Notice the relative steepness of their improvement trend. https://www.nist.gov/... [image]
- Nathan Lambert (@natolambert) on X: So much rests on which of these trend lines is more representative. [image]
- Alec Stapp (@alecstapp) on X: The export controls are working. Don't let NVIDIA lobbyists tell you any different.
- Alexander Doria (@dorialexander) on X: The issue is that benchmarks simultaneously undersell and oversell the gap. DeepSeekv4 belongs to an entirely new category of model by size and by design, and is the most dramatic step taken this year to bring the open architecture ecosystem closer to frontier.
- @scaling01 on X: chinese models are ~8 months behind and are falling further behind [image]
- Ethan Mollick (@emollick) on X: This is a good explanation of why the gap between open and closed models is larger than it appears in benchmarks. I would add in that current open models are also more fragile than closed: they handle out-of-distribution problems far less well & have lower emergent capabilities.
- @scaling01 on X: The AI model gap is bigger than you think
- @scaling01 on X: @rasbt They all cluster around DeepSeek-V4. I doubt it would change the entire trend
- Sebastian Raschka (@rasbt) on X: @scaling01 Since GPT 5.5 is on that chart, it would have been interesting to include GLM 5.1 and Kimi K2.6 as well (and perhaps Qwen3.6 Max)