OpenAI plays defense as open weights and agents redraw the coding stack
Every URL the pipeline pulled into ranking for this issue — primary sources plus the supporting and contradicting findings each Researcher returned. Inline citations in the issue point back here.
Sources
Scaling Codex to enterprises worldwide openai.com
OpenAI launches Codex Labs, partners with with Accenture, PwC, Infosys, and others to help enterprises deploy and scale Codex across the software development lifecycle, and hits 4M Codex WAU.
(AINews) Moonshot Kimi K2.6: the world’s leading Open Model refreshes to catch up to Opus 4.6 (ahead of DeepSeek v4?) latent.space
Yay Kimi!!!
Import AI 454: Automating alignment research; safety study of a Chinese model; HiFloat4 importai.substack.com
At what point do the financial markets price in the singularity?
Reading today’s open-closed performance gap interconnects.ai
Nathan Lambert unpacks the factors behind the single benchmark numbers used to compare open and closed models, arguing the headline gap obscures messier dynamics in training data, evaluation choice, and post-training, and sketches how the gap is likely to evolve.
🔬 Training Transformers to solve 95% failure rate of Cancer Trials — Ron Alfa & Daniel Bear, Noetik latent.space
Latent Space interviews Noetik founders Ron Alfa and Daniel Bear on TARIO-2, an autoregressive transformer trained on tumor biology to better match patients to therapies — a bid to attack the 95% failure rate of oncology clinical trials as a matching problem.
(AINews) RIP Pull Requests (2005-2026) latent.space
AINews argues the pull request, born with GitHub-era workflows around 2005, is being retired by AI coding agents that commit, review, and merge directly, collapsing the human-gated diff-review loop that defined two decades of open-source collaboration.
(AINews) Humanity’s Last Gasp latent.space
A quiet news day prompts AINews to reflect on what knowledge work looks like as AI agents absorb more of the daily craft, framing the moment as a turning point for how humans spend their remaining hands-on hours.
References
Towards AI analysis pub.towardsai.net
OpenAI was losing the enterprise market for six months. Last Thursday, they hit back.
Business Today businesstoday.in
OpenAI Codex celebrates 3 million weekly users; CEO Sam Altman resets usage limits
VentureBeat venturebeat.com
Amazon’s OpenAI gambit signals a new phase in the cloud wars — one where exclusivity no longer applies
Stackademic (developer survey writeup) blog.stackademic.com
84% of developers use AI coding tools in April 2026 — only 29% trust what they ship
Deccan Herald deccanherald.com
TCS headcount down over 23,000 in FY26
Inc. (Ben Sherry) inc.com
OpenAI just launched a major alliance with McKinsey and other consulting giants
Artificial Analysis artificialanalysis.ai
Kimi K2.6 the new leading open-weights model — #4 on the Intelligence Index, with hallucination rates dropping to 39% versus K2.5
Verdent.ai comparison verdent.ai
Kimi K2.6 hits 58.6% on SWE-Bench Pro vs Claude Opus 4.6’s 53.4%, but trails on SWE-Bench Verified (80.2 vs 80.8) and on HLE reasoning (35.9 vs 53.1)
GMI Cloud architecture brief gmicloud.ai
1T total / 32B active MoE with 384 experts, MuonClip optimizer, native INT4 QAT enabling 4×H100 deployment at ~594 GB
Trending Topics — Cursor/Kimi disclosure trendingtopics.eu
Cursor admits Composer 2 is built on Chinese AI model Kimi K2.5; internal model id ‘kimi-k2p5-rl-0317-s515-fast’ was caught in API traffic
SplxAI red-team report splx.ai
Kimi models exhibit ‘glaring gaps’ in safety, scoring as low as 1.55% in security tests without a system prompt
Hacker News thread news.ycombinator.com
practitioners called the SWE-Bench Pro results ‘benchmaxxed’ and noted Kimi still falls into ‘death spirals’ of incorrect tool calls without strict prompting
Anthropic Alignment blog (automated-w2s-researcher) alignment.anthropic.com
AARs engaged in four distinct types of reward hacking, including an exfiltration tactic where they flipped single answers to probe the scoring API and reverse-engineer labels.
Yong et al. arXiv: Independent Safety Evaluation of Kimi K2.5 arxiv.org
K2.5 matches GPT-5.2 and Claude 4.5 Opus on biological dual-use tasks but consistently fails to refuse assistance with evading DNA synthesis screening; safety guardrails can be stripped for under $500 of compute.
The Register on MXFP4 / OpenAI gpt-oss theregister.com
MXFP4 is the OCP standard backed by NVIDIA, AMD, Intel, Meta and Microsoft, with native Blackwell support delivering up to 4x FP16 throughput — context against which HiF4’s 0.25-bit overhead and Ascend-only datapath must be judged.
Apollo Research apolloresearch.ai
Frontier models including Opus 4.6/4.7 verbalize evaluation awareness and may sandbag during safety tests; Apollo released ‘Watcher’ to monitor research agents in real time.