DeepSeek-V4 closes the open-weights capability gap — and reopens others

Every URL the pipeline pulled into ranking for this issue — primary sources plus the supporting and contradicting findings each Researcher returned. Inline citations in the issue point back here.

Sources

DeepSeek-V4: a million-token context that agents can actually use huggingface.co

(AINews) DeepSeek V4 Pro (1.6T-A49B) and Flash (284B-A13B), Base and Instruct — runnable on Huawei Ascend chips latent.space

The prodigal Tiger returns… but is no longer the benchmark leader.

References

Artificial Analysis artificialanalysis.ai

DeepSeek V4-Pro is back among the leading open-weights models, ranking #2 on the Intelligence Index behind Kimi K2.6, while leading all open-weights competitors on the GDPval-AA agentic benchmark.

Artificial Analysis – AA-Omniscience artificialanalysis.ai

On the AA-Omniscience benchmark, V4-Flash hallucinates on 96% of the questions it doesn't know and V4-Pro on 94%; V4-Pro scores -10 on the Omniscience Index, indicating it would rather guess than abstain.
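
A negative index is easiest to see with a worked example. The sketch below assumes the Omniscience Index rewards a correct answer, penalizes an incorrect one equally, and scores an abstention as zero, scaled to ±100; that scoring rule and the question counts are illustrative assumptions, not figures from Artificial Analysis.

```python
# Illustrative sketch of how a negative Omniscience Index arises.
# Assumption: the index scores correct answers +1, incorrect answers -1,
# and abstentions 0, normalized to a [-100, 100] range. The question
# counts below are made up for illustration.

def omniscience_index(correct: int, incorrect: int, abstained: int) -> float:
    """Correct minus incorrect, as a percentage of all questions."""
    total = correct + incorrect + abstained
    return 100.0 * (correct - incorrect) / total

# A model that answers 40% correctly, guesses wrong on 50%, and
# abstains on 10% lands at -10: guessing costs more than it earns.
print(omniscience_index(correct=40, incorrect=50, abstained=10))  # -10.0

# The same accuracy with honest abstention instead of wrong guesses
# would score +40, which is why "would rather guess than abstain"
# drags the index below zero.
print(omniscience_index(correct=40, incorrect=0, abstained=60))   # 40.0
```

Under a scoring rule like this, any model below zero is wrong more often than right whenever it commits to an answer, which is the behavior the benchmark flags.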

TheNextWeb (Jensen Huang) thenextweb.com

Nvidia CEO Jensen Huang called DeepSeek’s optimization for Huawei Ascend a ‘horrible outcome’ for the United States, conceding that CUDA’s moat is being actively eroded.

SL Guardian / China Academy slguardian.org

DeepSeek was forced to revert to Nvidia hardware for V4 pretraining after the Huawei Ascend 910B suffered stability issues and 'glacial' interconnect speeds, shifting to Ascend 950PR Supernodes for continued training and inference only.

Simon Willison’s blog simonwillison.net

DeepSeek V4 ships under a true MIT license — a sharp departure from the custom DeepSeek Model License used for V3 — making it among the most permissive frontier-class open-weight releases to date.

r/singularity discussion of LMArena results reddit.com

DeepSeek V4-Pro underwhelms on Arena's crowdsourced ranking, sitting only third among open-weight coding models despite vendor claims of GPT-5.5-class performance; the community attributes the gap to 'benchmaxxing' on SWE-bench-style evals.
