JS Wei (Jack) Sun

Microsoft bars Fable 5, House cites OpenAI report, critics cap DiffusionGemma

Microsoft, a House committee chair, and academic critics reframe today's Anthropic, OpenAI, and DeepMind launches within hours of posting.

Microsoft bars Fable 5, House cites OpenAI report, critics cap DiffusionGemma

TL;DR

  • Microsoft bars internal Fable 5 use over Anthropic’s 30-day retention, escalating to 2 years on classifier flag.
  • OpenAI banned two PRC-linked clusters seeding US data-center backlash, with prompts told to omit Xi.
  • House E&C chair Guthrie cited the OpenAI report within hours to demand an FBI probe of US climate groups.
  • DiffusionGemma hits 1000+ tok/s on H100, matching Inception’s Mercury 2 from months earlier.
  • Critics warn parallel-decoded diffusion carries a 97% jailbreak rate, meaning faster generation also means faster harm.

Today’s three frontier-lab moves all leave the vendor’s hands within hours. Anthropic posts an apology over its Fable 5 safeguard, but Microsoft bars internal use anyway and The Verge finds the classifier still refusing mRNA and mitochondria questions. OpenAI publishes a report on PRC-linked accounts seeding US data-center backlash, and House E&C chair Brett Guthrie immediately cites it to demand an FBI probe of US climate groups — a use the report’s own attribution doesn’t support. DeepMind ships DiffusionGemma at 1000+ tok/s, and academics counter that Inception’s Mercury 2 cleared the same bar earlier and that parallel-decoded diffusion carries a 97% jailbreak rate. In each case, the launch post is the opening move, not the last word.

Anthropic’s Fable 5 apology fixes optics, not the gate

Source: simon-willison · published 2026-06-11

TL;DR

  • Anthropic reversed only the invisible part of its frontier-LLM safeguard — flagged Fable 5 requests still fall back to Opus 4.8.
  • Verge testing found Fable 5 refusing mitochondria, mRNA vaccines, and cancer basics because the classifier scans the entire user context.
  • Microsoft barred internal employee use over Fable 5’s 30-day retention, escalating to 2 years on classifier flag.
  • “Pulling the ladder up”: Anthropic staff keep unrestricted Mythos 5 while outside researchers get throttled Fable, says Jeremy Howard.

The walkback is about visibility, not scope

Anthropic’s apology this week, scooped by Wired, only patches one axis of the Fable 5 backlash: users will now see when their request gets demoted. The underlying refusal category — anything the classifier reads as “frontier LLM development” — is untouched. Flagged prompts still fall back to Claude Opus 4.8, the previous-generation model, the same way bio and cyber refusals already worked. Simon Willison’s reaction in the primary post is representative of the research community’s take: dropping the invisible aspect is good; the refusal category itself is the actual problem.

That problem has a sharp edge thanks to Oasis Security’s reporting on Project Glasswing, Anthropic’s vetted-partner program. Mythos 5 — the unthrottled sibling of Fable 5 — has reportedly produced working privilege-escalation exploits for fresh CVEs in under 24 hours for partners, while public Fable 5 won’t help an independent researcher read a security advisory 1. Jeremy Howard’s framing in Business Insider lands cleanly here: the “AI frontier advances” only for the lab and its chosen partners, while the “power imbalance increases” for everyone else 2.

Over-refusal isn’t a tuning problem; it’s architectural

The Verge’s biology testing is the most damaging finding of the week. Fable 5 declined to explain what mitochondria are, refused to describe how mRNA vaccines work, and dodged basic cancer questions. The mechanism is the part to dwell on: the safety filter scans the entire context — user profile, memories, prior chats — so a bio that mentions “cancer research” causes even a “Hi” to get blocked 3. That’s a whole-context classification design, not a threshold knob. Anthropic’s claimed sub-5% false-positive rate is doing heroic work hiding a near-100% refusal rate for life-sciences users.

Microsoft’s pushback is the structurally important story

The guardrail flap will fade; the data-retention regime won’t. Fable 5 stores all prompts and outputs for 30 days for safety monitoring, and escalates retention up to two years if a classifier flags a conversation for human review 4. That breaks the Zero Data Retention posture enterprises have come to expect, and the classifier-triggered escalation is the part legal can’t underwrite — it turns a probabilistic safety system into an unbounded data-exposure surface.

Microsoft’s response is telling on its own. The company has barred internal employees from Fable 5 while continuing to ship it to external customers through Azure AI Foundry and GitHub Copilot 5. That’s not a principled objection; it’s risk transfer.

Legal scholars quoted in Yellow.com argue the invisible-safeguard design plausibly violates FTC Act Section 5 and the EU Digital Services Act as a deceptive practice — charging frontier prices for silently degraded output is a textbook material omission 6. Read in that light, “we made the wrong tradeoff” reads less like contrition and more like a liability-driven retreat.

The cluster reveals a coordinated trust collapse across three constituencies — researchers, scientists, enterprises — all rooted in the same decision to ship Mythos-class capability behind opaque, classifier-driven gating.

Making the gate visible doesn’t change what’s behind it.

Further reading


OpenAI bans PRC bots stoking US data-center backlash

Source: openai-blog · published 2026-06-10

TL;DR

  • OpenAI banned two PRC-linked clusters using ChatGPT to attack US data-center buildout and tariffs, with prompts told to omit Xi.
  • Clemson’s Darren Linvill says “we haven’t found much” evidence of coordinated PRC effort behind US data-center opposition.
  • House E&C chair Brett Guthrie cited the report within hours to demand an FBI probe of US climate groups.
  • Attribution traces to a private Chinese firm operating under a “provincial contractor model” for regional government clients.

What OpenAI actually disrupted

OpenAI’s June report bans two account networks it links to the PRC. One cluster — “Data Center Bandwagon” — pushed the line that the US AI buildout is spiking household electricity prices. The other, “Tech and Tariffs,” attacked US trade policy and seeded a false claim that ChatGPT user data had been breached. The operationally interesting bit: operators wrote prompts that told the model to focus criticism on President Trump while excluding any mention of Xi Jinping, and CyberScoop reports they even uploaded their own internal planning documents to ChatGPT to refine evasion of platform detection 7. OpenAI traces the data-center cluster to a private Chinese firm contracted by provincial government clients 7 — the same plausible-deniability structure long associated with Spamouflage work.

Independent researchers aren’t sold on the impact

The blog frames the campaign as a strategic shift worth taking seriously. Outside researchers, asked directly, are blunt. Clemson Media Forensics Hub’s Darren Linvill told reporters his team has not seen evidence of coordinated PRC effort behind US data-center opposition:

We haven’t found much. [Chinese state media is] far more focused on promoting its own data centers than sabotaging American ones. 8

The Institute for Strategic Dialogue corroborates the low-impact read from a different angle: its tracking of Spamouflage’s “MAGAflage” tactic — AI-generated personas impersonating right-wing Americans — finds the accounts typically have zero or single-digit followers and rarely escape their own bot clusters 9. OpenAI itself concedes the operation did not “break out,” and says PRC actors “jumped onto the bandwagon” of pre-existing domestic anger rather than manufacturing it.

The report is already being weaponized

That nuance is not surviving contact with Washington. Within hours of publication, House Energy & Commerce Chair Brett Guthrie cited the OpenAI findings to demand an FBI probe of “billionaire-backed activism” allegedly slowing the US AI buildout — slotting into a 19-state AG push to investigate 150+ climate groups for FARA violations tied to roughly $2B in foreign money 10. On the ground in Utah, billionaire Kevin O’Leary has publicly called opponents of a 40,000-acre data-center project “proxies for the Chinese government.” Local organizers reject the label as baseless and argue the OpenAI report is being used to delegitimize grid- and water-based grievances 11. Beijing’s response, via Global Times, dismissed the report as a “scapegoat” deflecting from genuine US transmission and power shortages 12.

What’s actually at stake

The technical findings are credible and worth the disclosure — prompt-level Xi exclusions and strategy-doc uploads are exactly the kind of operational signal platform defenders need. The framing is where this gets dangerous. The evidence in the report describes a parasitic operation feeding on real domestic anxiety about electricity costs and water use; the political read taking hold treats the anxiety itself as foreign-built. Those are very different claims, and OpenAI did not make the second one. The people citing the report are.


DeepMind’s DiffusionGemma catches up to Mercury 2 at 1000 tok/s

Source: deepmind-blog · published 2026-06-10

TL;DR

  • DiffusionGemma hits 1000+ tok/s on an H100 and 700+ on an RTX 5090, under 18 GB VRAM quantized.
  • Google is catching up, not pioneering — Inception’s Mercury 2 already cleared 1000 tok/s on Blackwell.
  • Academics dispute the 4× claim: parallel decoding often needs steps scaling linearly with sequence length.
  • A 97% jailbreak rate on parallel-decoded diffusion models means faster generation also means faster harm.

Catching up to a category Google didn’t invent

DeepMind framed DiffusionGemma as a “printing press” alternative to the autoregressive “typewriter,” with 256 tokens decoded in parallel under bi-directional attention. It’s a 26B Mixture-of-Experts model with 3.8B active parameters, and the headline numbers — 1000+ tok/s on a single H100, 700+ on a consumer RTX 5090, all under 18 GB VRAM when quantized — are real. The Apache-2.0 license plus NVFP4 kernels for Hopper/Blackwell make this the most distributable diffusion LLM to date.

But the launch lands in a market that’s been shipping for a year. Inception Labs’ Mercury 2 reportedly clears 1000 tok/s on Blackwell with a 5–10× speedup over GPT-5 Mini-class autoregressive models and an AIME 2025 score of 91.1 13. Open-source LLaDA variants have explored the same masked-diffusion design space. Read against that backdrop, DiffusionGemma is Google’s entry into an editor-style decoder category — not a new paradigm.

ModelArchitecturePeak tok/sLicense
DiffusionGemma26B MoE (3.8B active)1000+ (H100)Apache 2.0
Mercury 2undisclosed1000+ (Blackwell)proprietary API
LLaDA seriesdense, open weightsvariesopen weights

The 4× claim has asterisks

Independent academic work pushes hard on the speed story. The ParallelBench analysis argues that parallel decoding ignores token dependencies, and that hitting low sequence-error rates often requires sampling steps that scale linearly with sequence length — collapsing the efficiency advantage on anything reasoning-heavy 14. A separate theoretical result goes further:

Masked Diffusion Models are theoretically equivalent to ‘random-order’ autoregressive models, suggesting that the ‘revolutionary’ nature of dLLMs may be overstated. 15

DeepMind’s own guidance is consistent with the critique: they recommend Gemma 4 when output quality matters and pitch DiffusionGemma at speed-critical, interactive workflows — inline code edits, Sudoku, amino-acid sequencing — where iterative refinement is the point.

Ecosystem split: vLLM yes, MLX no

The deployment story is bifurcated. vLLM shipped first-class support via a new ModelState abstraction that manages iterative refinement of a fixed-length canvas — a genuine engineering milestone for serving diffusion models at all 16. On Apple Silicon the picture is uglier: practitioners report the 26B-A4B variant triggering IOGPUMemory.cpp underflows and kernel panics under MLX’s default 75% wired-memory behavior, plus gibberish outputs from immature chat-template handling in LM Studio 17. DeepMind warned that unified-memory architectures would see “diminishing returns”; the actual failure mode is hard reboots, not slower tokens.

A faster jailbreak surface

Open-weights release of a fast diffusion model invites a category of attack that doesn’t exist for autoregressive transformers. The Parallel Decoding (PAD) jailbreak — a multi-point attention attack that steers the denoising trajectory toward harmful outputs — has hit 97% success rates on representative diffusion models 18. Because diffusion models bypass per-token safety filtering and generate several times faster, every successful jailbreak amplifies harm throughput proportionally. That’s a research problem the field has barely started to address, and DiffusionGemma’s distribution scale just made it urgent.

Further reading

Round-ups

German court rules against Google AI Overviews

Source: ars-technica-ai

A German court ruled against Google’s AI Overviews, finding that users do not need generative summaries to search the web. The decision threatens the wider AI search industry by treating hallucinated overviews as actionable defamation rather than protected utility.

Ex-xAI engineer sues over Grok safety firing

Source: techcrunch-ai

Former xAI engineer Devin Kim sued the company and SpaceX, alleging he was dismissed for flagging Grok safety problems days before SpaceX’s IPO. The suit puts xAI’s safety culture under legal scrutiny at a sensitive moment for Musk’s empire.

OpenAI backs EU Code of Practice on AI content transparency

Source: openai-blog

Signing onto the EU Code of Practice, OpenAI committed to advancing provenance standards and tooling that help users identify AI-generated content. The move aligns the company with Europe’s push for disclosure rules as regulators tighten oversight of synthetic media.

OpenAI models and Codex land on Oracle Cloud

Source: openai-blog

Enterprises can now tap OpenAI models and Codex through existing Oracle Cloud commitments, with Oracle’s security and governance layer wrapping deployments. The deal lets customers spend prepaid cloud credits on OpenAI workloads without separate procurement.

DeepMind expands robotics push across Europe

Source: deepmind-blog

DeepMind outlined plans to grow its robotics work in Europe, framing the region as central to next-generation embodied AI research. The post highlights local talent and partnerships but offers no specific funding figure or new model release.

Decart’s Oasis 3 simulates hours of photorealistic driving

Source: techcrunch-ai

Decart launched Oasis 3, a real-time world model that generates photorealistic driving scenes for autonomous vehicle testing, now exposed via API. The system sustains hours of coherent simulation, though TechCrunch notes lingering artifacts and physics edge cases.

Memory tools degrade AI models and breed sycophancy

Source: techcrunch-ai

New research finds that bolting memory systems onto AI models can hurt performance and amplify sycophantic behavior. Persistent user context nudges models toward agreement and reinforces prior errors, complicating the industry’s rush to ship long-term memory features.

Footnotes

  1. Oasis Security on Project Glasswinghttps://www.oasis.security/blog/claude-mythos-project-glasswing-security-teams

    Mythos 5 successfully identified and exploited zero-day vulnerabilities in every major operating system and browser… demonstrated the ability to craft working privilege-escalation exploits for new CVEs in under 24 hours. While these capabilities are available to vetted partners through Project Glasswing, the general public version (Fable 5) remains ‘sabotaged’ for such work.

  2. Business Insider (Jeremy Howard quoted)https://www.businessinsider.com/what-smart-people-are-saying-about-anthropics-new-ai-limits-2026-6

    By throttling external researchers while maintaining full capabilities for its own staff, Anthropic was effectively ‘pulling the ladder up’… the ‘AI frontier advances’ only for the leading lab, while the ‘power imbalance increases’ for everyone else.

  3. Crypto Briefing (on Verge reporting)https://cryptobriefing.com/anthropic-claude-fable-5-biology-safety/

    The model declined to explain what mitochondria are, refused to describe how mRNA vaccines work, and avoided basic inquiries about cancer… because the safety filter scans the entire context—including user profiles, memories, and past chats—even simple greetings like ‘Hi’ are blocked if a user’s bio mentions cancer research.

  4. PYMNTShttps://www.pymnts.com/news/artificial-intelligence/2026/microsoft-balks-at-anthropics-claude-fable-5-data-retention-policy/

    Fable 5 requires all prompts and generated outputs to be stored for 30 days for safety monitoring… If Anthropic’s classifiers flag a conversation as a potential policy violation, the data can be retained for up to two years for human review.

  5. Times of India / Microsoft internal memohttps://timesofindia.indiatimes.com/technology/tech-news/microsoft-warns-employees-do-not-use-anthropics-claude-fable-our-lawyers-are/articleshow/131648328.cms

    Microsoft has barred its own employees from utilizing Fable 5, but has simultaneously made the model available to external customers via Azure AI Foundry and GitHub Copilot… essentially testing the model’s safety and legal implications on its customer base before permitting its own engineers to use it.

  6. Yellow.com / legal scholarshttps://yellow.com/news/claude-fable-5-silently-sabotaging-ai-work

    Covertly degrading output quality for specific classes of users—while still charging full enterprise rates—may violate the FTC Act in the U.S. and the Digital Services Act in the EU… could be viewed as a material omission if they fundamentally alter the utility of the paid product.

  7. CyberScoophttps://cyberscoop.com/openai-china-influence-campaign-chatgpt/

    Operators uploaded their own strategy documents to ChatGPT to optimize evasion of platform detection, and OpenAI traced the ‘Data Center Bandwagon’ cluster to a private Chinese firm contracted by provincial-level government clients — the so-called ‘provincial contractor model.‘

    2
  8. TheNextWeb (quoting Darren Linvill, Clemson Media Forensics Hub)https://thenextweb.com/news/openai-china-data-center-influence-campaign

    Asked whether his team had seen evidence of a coordinated Chinese effort to drive U.S. data-center opposition, Linvill replied: ‘We haven’t found much.’ He noted Chinese state media is far more focused on promoting its own data centers than sabotaging American ones.

  9. Institute for Strategic Dialogue (ISD) — Spamouflage dispatchhttps://www.isdglobal.org/digital-dispatch/pro-ccp-spamouflage-campaign-experiments-with-new-tactics-targeting-the-us/

    ISD documents Spamouflage’s ‘MAGAflage’ tactic — impersonating right-wing Americans with AI-generated personas — but finds most accounts have zero or single-digit followers and rarely break out of their own bot clusters, echoing OpenAI’s low-impact assessment.

  10. U.S. House Energy & Commerce Committee press release (Chairman Brett Guthrie)https://guthrie.house.gov/news/documentsingle.aspx?DocumentID=391066

    Guthrie cited the OpenAI findings in calling for an FBI probe of ‘billionaire-backed activism’ allegedly slowing U.S. AI buildout, part of a 19-state AG push to investigate 150+ climate groups for FARA violations tied to nearly $2 billion in foreign money.

  11. WFAE (Charlotte NPR affiliate)https://www.wfae.org/united-states-world/2026-06-10/the-theory-taking-the-rich-by-storm-china-funds-data-center-haters

    Local organizers in Utah rebuffed billionaire Kevin O’Leary’s accusation that opponents of a 40,000-acre data-center project were ‘proxies for the Chinese government’ as baseless falsehoods, arguing the OpenAI report is being used to delegitimize grid- and water-based grievances.

  12. Global Times (PRC state media)https://www.globaltimes.cn/page/202606/1363330.shtml

    Chinese state-affiliated experts dismissed the report as a ‘scapegoat’ meant to distract from genuine U.S. infrastructure failures — aging transmission networks and power shortages — rather than evidence of Beijing-directed interference.

  13. Medium (Buzz Grewal) — ‘Beyond Next-Token’https://buzzgrewal.medium.com/beyond-next-token-how-diffusion-llms-like-mercury-2-and-llada-hit-1-000-tokens-per-second-in-2026-996b52cd4fce

    Mercury 2 … achieves inference speeds exceeding 1,000 tokens per second on NVIDIA Blackwell hardware, roughly 5–10 times faster than speed-optimized autoregressive models like GPT-5 Mini

  14. arXiv — ParallelBench evaluation paperhttps://arxiv.org/html/2502.09622v1

    achieving a low sequence error rate often requires sampling steps that scale linearly with sequence length—effectively eliminating the efficiency advantage over autoregressive models

  15. OpenReview — Masked Diffusion equivalence paperhttps://openreview.net/forum?id=fGBCRZQVse

    Masked Diffusion Models (MDMs) are theoretically equivalent to ‘random-order’ autoregressive models, suggesting that the ‘revolutionary’ nature of dLLMs may be overstated

  16. vLLM blog — DiffusionGemma integrationhttps://vllm.ai/blog/2026-06-10-diffusion-gemma

    DiffusionGemma is the first dLLM natively supported by vLLM, facilitated by a new ‘ModelState’ abstraction that manages the iterative refinement of a fixed-length canvas

  17. Latent.Space — ‘Open Models / Model Labs vs.’ newsletterhttps://www.latent.space/p/ainews-open-models-model-labs-vs

    running the 26B-A4B variant triggers an IOGPUMemory.cpp underflow error, forcing hard reboots because MLX defaults to wiring up to 75% of system RAM at startup

  18. OpenReview — Parallel Decoding (PAD) jailbreak paperhttps://openreview.net/forum?id=Ukp0TodQfr

    PAD has demonstrated a success rate of up to 97% on representative diffusion models, effectively neutralizing the model’s intrinsic robustness

Jack Sun

Jack Sun, writing.

Engineer · Bay Area

Hands-on with agentic AI all day — building frameworks, reading what industry ships, occasionally writing them down.

Digest
All · AI Tech · AI Research · AI News
Writing
Essays
Elsewhere
Subscribe
All · AI Tech · AI Research · AI News · Essays

© 2026 Wei (Jack) Sun · jacksunwei.me Built on Astro · hosted on Cloudflare