Wei (Jack) Sun

xAI bills Anthropic $1.25B/mo, OpenAI cracks Erdős, Google rebuilds on agents

SpaceX's S-1 reveals Anthropic paying xAI $1.25B monthly, an OpenAI model disproves a 1946 Erdős conjecture, and Google rewires Workspace around agents.

TL;DR

  • xAI will bill Anthropic $1.25B/month for Colossus 1, a cluster whose mixed-GPU design capped training at roughly 11% utilization.
  • OpenAI’s reasoning model disproves Erdős’s 1946 unit-distance conjecture, with Princeton’s Sawin pinning a concrete bound.
  • Google’s Spark agent runs 24/7 on Gmail, Drive, and Calendar in a textbook lethal-trifecta configuration.
  • Anthropic projects $10.9B Q2 revenue and its first profitable quarter, per investor briefing.
  • Nvidia logs another record quarter as Huang sizes the agent-CPU market at $200B.

Today’s AI news lands at three very different layers of the stack. At the capital layer, SpaceX’s S-1 filing exposes that Anthropic will pay xAI $1.25B a month to lease Colossus 1 — a cluster whose mixed-GPU layout capped training at roughly 11% utilization — pushing Anthropic’s total compute book past $130B even as it projects its first profitable quarter on $10.9B in Q2 revenue. At the capability layer, an internal OpenAI reasoning model has disproved a 1946 Erdős conjecture cleanly enough that Tim Gowers says he’d have sent the writeup to Annals of Mathematics without hesitation, though OpenAI still won’t disclose the sampling setup or compute budget. At the product layer, Google’s I/O keynote rewires Search, the IDE, and Workspace around always-on agents — with Spark sitting on Gmail, Drive, and Calendar in exactly the configuration security researchers flag as a lethal trifecta.

xAI to bill Anthropic $1.25B/mo for a cluster it couldn’t use

Source: techcrunch-ai · published 2026-05-20

TL;DR

  • SpaceX’s S-1 exposes xAI’s $6.4B 2025 burn and confirms Anthropic will pay $1.25B/month to lease Colossus 1.
  • Colossus 1’s mixed-GPU design capped training at ~11% utilization, so xAI is renting out a cluster it couldn’t train on efficiently.
  • Anthropic’s total compute book now tops $130B across AWS, Google, Azure, Fluidstack and SpaceX — roughly 7 GW.
  • The contract carries a 90-day termination-for-convenience clause that lets SpaceX walk on a quarter’s notice.
  • Musk separately claims SpaceX can reclaim the compute Anthropic is renting under a “harm to humanity” kill-switch.

A landlord deal, not a partnership

The headline number from SpaceX’s S-1 (Anthropic paying xAI $1.25B per month for compute) reads like an alliance between rivals. The technical detail underneath says the opposite. Tom’s Hardware reports that Colossus 1’s mixed H100/H200/GB200 fabric hit a “straggler effect” that capped synchronous training at roughly 11% GPU utilization, because the whole cluster had to wait on the slowest chips [1]. xAI is migrating Grok 5 training to a homogeneous Blackwell-only Colossus 2 and handing the older cluster to Anthropic for inference, where loose synchronization tolerates the heterogeneity.
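The straggler effect can be illustrated with a toy model. The sketch below assumes a fully synchronous step in which every worker waits for the slowest one; the function name and the per-step timings are hypothetical, chosen only for illustration. It does not reproduce the reported 11% figure, which describes GPU utilization on the real cluster rather than this simplified busy fraction.

```python
def sync_busy_fraction(step_times):
    """Average fraction of a synchronous step that workers spend computing.

    Each worker finishes its share in step_times[i] seconds, then idles
    until the slowest worker is done (assumed barrier at the end of the step).
    """
    slowest = max(step_times)
    return sum(t / slowest for t in step_times) / len(step_times)


# Hypothetical timings: six fast workers and two slow ones, versus a uniform fleet.
mixed_fleet = [1.0] * 6 + [4.0] * 2
uniform_fleet = [1.0] * 8

print(f"mixed fleet:   {sync_busy_fraction(mixed_fleet):.1%}")
print(f"uniform fleet: {sync_busy_fraction(uniform_fleet):.1%}")
```

In this toy setup, two slow workers out of eight pull the average busy fraction below half, while the uniform fleet stays fully busy. That is the intuition behind moving synchronous training to a homogeneous cluster and leaving the mixed one for loosely coupled inference.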

That reframes the transaction. xAI isn’t generously sharing scarce frontier compute; it’s monetizing a depreciating asset it couldn’t fully exploit. Puck’s William Cohan reads the pact as Musk effectively “giving up on xAI” as a frontier-model contender and pivoting to landlord economics. That is striking, given that Musk previously called Anthropic “super smug, sanctimonious, and hypocritical” and its models “misanthropic and evil” [2].

The $1.25B/mo is a slice, not the pie

For Anthropic, the SpaceX line item is one entry in a compute ledger that now behaves like a semiconductor foundry commitment:

| Counterparty | Commitment | Notes |
| --- | --- | --- |
| AWS | $100B / 10 yr | Trainium-heavy |
| Google + Broadcom | tens of $B | ~1M TPUs |
| SpaceX (xAI) | $15B / yr | Colossus 1 inference |
| Azure + Fluidstack | undisclosed | balance |
| Total | >$130B | ~7 GW equivalent |

Source: Private Credit Pulse aggregation [3]. The strategic logic is diversification away from any single supplier rather than a bet on xAI specifically, but it also means Anthropic’s gross-margin profile no longer resembles that of a SaaS company.
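The SpaceX and AWS rows can be put on the same annual footing with simple arithmetic. The sketch below uses only the figures quoted in this section; the helper names are made up for illustration, and the contract terms behind the other rows are not disclosed, so it does not attempt to decompose the >$130B total.

```python
MONTHS_PER_YEAR = 12


def annualize_monthly(monthly_usd_b: float) -> float:
    """Convert a monthly payment (in $B) to an annual run rate (in $B)."""
    return monthly_usd_b * MONTHS_PER_YEAR


def average_per_year(total_usd_b: float, years: float) -> float:
    """Spread a multi-year commitment (in $B) evenly across its term."""
    return total_usd_b / years


spacex_per_year = annualize_monthly(1.25)    # $1.25B/month from the S-1
aws_per_year = average_per_year(100.0, 10)   # $100B over 10 years

print(f"SpaceX (xAI): ${spacex_per_year:.1f}B per year")
print(f"AWS:          ${aws_per_year:.1f}B per year (simple average)")
```

The $1.25B monthly payment matches the $15B/yr row, and on a simple-average basis it is larger per year than the AWS commitment, even though AWS is the bigger headline number.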

The risks the TechCrunch headline buries

Two fault lines deserve more weight than the dollar figure. First, environmental and reputational: the NAACP, represented by SELC and Earthjustice, sued xAI in April 2026 over dozens of unpermitted methane turbines installed next to majority-Black Memphis neighborhoods with documented asthma burdens [4]. xAI is now buying another $2.8B of turbines. Anthropic, a Public Benefit Corporation, is routing inference through that facility, and the safety org has noticed: head of Safeguards Research Mrinank Sharma resigned with a letter warning “the world is in peril,” one of several departures clustered around the deal [5].

Second, contractual: the S-1 confirms a 90-day termination-for-convenience clause, and Musk has publicly claimed SpaceX “reserves the right to reclaim the compute if their AI engages in actions that harm humanity” [6]. Given his prior characterization of Claude, that is non-trivial counterparty risk dressed up as safety language.

What the cluster actually shows

Read together, the five filings and follow-ups invert the original narrative. The story isn’t “rivals find common ground on compute.” It’s that an IPO filing forced xAI to disclose a $6.4B burn and a flawed cluster, and Anthropic — locked into a $130B infrastructure habit — was the buyer willing to accept environmental, ideological and rug-pull risk to keep the inference lights on. Both companies look more constrained, not less, after the disclosure.


OpenAI model disproves Erdős unit-distance conjecture

Source: openai-blog · published 2026-05-20

TL;DR

  • An internal OpenAI reasoning model disproved Paul Erdős’s 1946 conjecture that planar unit-distance pairs grow as n^(1+o(1)).
  • Princeton’s Will Sawin then pinned the AI’s bound at a concrete δ ≥ 0.014.
  • Fields Medalist Tim Gowers says he’d have recommended the proof for Annals of Mathematics without hesitation.
  • OpenAI still hasn’t disclosed sampling setup or compute budget — and the O(n^(4/3)) upper bound stays untouched, so the problem is narrowed, not closed.

“For real this time” is doing real work

The TechCrunch headline’s qualifier isn’t snark; it’s the whole story. In October 2025, OpenAI VP Kevin Weil and researcher Sebastian Bubeck claimed GPT-5 had “solved” ten Erdős problems. Thomas Bloom and Yann LeCun showed within hours that the model had merely surfaced existing proofs from the literature, and OpenAI retracted, citing a “verification pipeline failure” [7]. That episode is why the May 2026 unit-distance result is being staged so carefully: named external verifiers (Gowers, Sawin, Arul Shankar), an independent companion paper, and a construction that demonstrably does not exist in the training corpus.

This time the proof itself is the evidence. Erdős conjectured that n points in the plane can produce at most n^(1+o(1)) pairs at exactly unit distance, i.e. essentially linear growth. The model built an infinite family of configurations achieving n^(1+δ) for a fixed δ > 0, killing the conjecture outright. Sawin’s arXiv preprint then made δ concrete at 0.014 by working with arbitrary ideals rather than full rings of integers and modifying the Golod–Shafarevich application to quotient by fixed powers of Frobenius elements [8]. The fact that a human mathematician could pick up the construction and produce a publishable quantitative refinement within days is itself a verification signal: this is genuine mathematics, not an opaque artifact.
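In symbols, the statements above can be lined up as follows. The notation u(n) for the maximum number of unit-distance pairs among n points in the plane is an assumption introduced here for readability, and this is a summary of the claims as reported in this section, not a restatement of the proof.

```latex
\begin{align*}
\text{Erd\H{o}s's conjecture (1946):}\quad
  & u(n) \le n^{1+o(1)} \\
\text{Model's construction:}\quad
  & u(n) \ge n^{1+\delta} \ \text{along an infinite family of configurations, for a fixed } \delta > 0 \\
\text{Sawin's refinement:}\quad
  & \delta \ge 0.014 \\
\text{Long-standing upper bound:}\quad
  & u(n) = O\!\left(n^{4/3}\right)
\end{align*}
```

Taken together, the reported bounds place the true growth exponent somewhere between 1.014 and 4/3, which is why the problem is narrowed rather than closed.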

The “page 39 moment”

What experts are flagging is not the answer but the move that produced it. According to reporting on the chain of thought, the model spent dozens of pages working the problem through standard incidence geometry before pivoting, on page 39, to algebraic number theory, reframing the “enormous degree” of algebraic realizations as a source of possible counterexamples rather than a nuisance to be controlled [9]. The final construction uses infinite class field towers and generalizations of Gaussian integers, tools that no human had connected to the unit-distance problem.

That’s a sharp contrast with Terence Tao’s 2024 read on o1 as a “mediocre but not completely incompetent graduate student,” able to recall relevant theorems but unable to generate original conceptual moves without heavy prodding [10]. Eighteen months later, the same lineage is being credited with the kind of cross-subfield analogy that defines research taste.

What’s still missing

The skeptical reading, mostly playing out on Reddit, is that OpenAI has disclosed neither sampling setup nor inference-compute budget [11]. Noam Brown’s own framing (that 20 seconds of test-time thinking can substitute for 100,000× pre-training scale [12]) implies the proof may have eaten substantial compute, and closed-lab opacity makes it impossible to situate this as a reproducible milestone rather than a one-off demonstration. The result also only attacks the near-linear lower-bound conjecture; the long-standing O(n^(4/3)) upper bound is untouched, so the unit-distance problem is narrowed, not closed.

Net: this is the first AI mathematics result that has survived expert scrutiny at research level, and it survives on Gowers and Sawin’s signatures rather than OpenAI’s word. The regime-change question waits on compute disclosure and a second, independent instance.


Google rebuilds Search, IDE, and Gmail around agents at I/O

Source: google-ai-blog · published 2026-05-20

TL;DR

  • Gemini 3.5 Flash hits 76.2% on Terminal-Bench 2.1, trailing GPT-5.5’s 78.2%.
  • Spark runs 24/7 on Gmail, Drive, Calendar — a textbook “lethal trifecta” for agent security.
  • PromptArmor chained injections through Antigravity to spawn a malicious subagent and exfiltrate data.
  • Publisher CTR roughly halves (15% → 8%) under AI Overviews as agentic Search expands.

The agentic default

The 100-item I/O 2026 announcement list reads less like a product keynote than a re-platforming. Gemini 3.5 Flash is the new Antigravity runtime; Spark is a persistent agent that keeps executing with your laptop closed; Search swaps its query box for an “AI Search” surface with background information agents and a Universal Cart; even Workspace gets a Daily Brief that crawls your inbox overnight. The unifying message: agents are no longer a demo lane next to the chat box — they are the chat box, the IDE, and the SERP.

That framing is the bet. Whether the underlying models, security model, and economics support it is where the independent coverage diverges sharply from Google’s slide deck.

Security: the trifecta is already firing

Willison’s hands-on reaction zeroes in on Spark as a near-perfect instantiation of his “lethal trifecta” (private data access, exposure to untrusted content, and an external communication path), now running unattended in a cloud VM [13]. His verdict is conditional but pointed: Google’s ephemeral-VM and Agent Gateway defenses may hold, “or this could be a top candidate for the agent security Challenger disaster that we still haven’t seen” [13].

It is not hypothetical. PromptArmor researchers demonstrated an indirect-injection chain inside Antigravity that manipulated the IDE into invoking a malicious subagent and exfiltrating sensitive information [14]. The failure was at the orchestration layer, not the sandbox, which is exactly where the new “Subagent Teamwork” preview lives.

flowchart LR
    A[Gmail / Drive / Calendar] --> S{Spark / Antigravity agent}
    B[Untrusted web + email content] --> S
    S --> C[Subagent delegation]
    C -. exfiltration .-> E((External endpoints))
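The three-part condition can be expressed as a simple configuration check. The sketch below is a hypothetical illustration: the `AgentConfig` class and its field names are invented here and do not correspond to any Google or Antigravity API. It only encodes the rule described above, that risk is highest when all three capabilities are present in one agent.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical capability flags for an agent deployment."""
    reads_private_data: bool          # e.g. mail, documents, calendar
    ingests_untrusted_content: bool   # e.g. inbound email, web pages
    can_communicate_externally: bool  # e.g. outbound requests, sending mail


def has_lethal_trifecta(cfg: AgentConfig) -> bool:
    """True only when all three risky capabilities coexist."""
    return (
        cfg.reads_private_data
        and cfg.ingests_untrusted_content
        and cfg.can_communicate_externally
    )


# A Spark-like setup as described in the text: all three legs present.
spark_like = AgentConfig(True, True, True)

# The same agent with the outbound channel removed (an assumed mitigation).
no_egress = replace(spark_like, can_communicate_externally=False)

print(has_lethal_trifecta(spark_like))
print(has_lethal_trifecta(no_egress))
```

Removing any one leg breaks the trifecta in this model. The PromptArmor result suggests the hard part in practice is that subagent delegation can quietly reintroduce a leg that the top-level agent was meant to lack.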

Benchmarks and the quiet repricing

The headline number checks out, narrowly. Epoch AI’s leaderboard confirms 76.2% on Terminal-Bench 2.1 but places Flash behind GPT-5.5 at 78.2%, with OpenAI’s Codex still topping the older v2.0 at 77.3% [15]. “Frontier-level” is defensible; “leading” is not.

Pricing is where the framing breaks down. The new Flash tier lands at $1.50 / $9.00 per million input/output tokens, a 3–6× jump over earlier Flash generations, and developer reaction on r/singularity has settled on “benchmaxxed” as the working label [16]. Pair that with the $100/month Ultra subscription and the lineup looks less like a commodity-priced agent platform and more like a quiet upward repricing of the entire stack, sold under an agentic narrative.
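To see what the quoted rates mean for a running agent, a rough cost estimate helps. The sketch below uses the $1.50 / $9.00 per-million-token prices from the text; the function name and the daily token volumes are hypothetical, picked only to show the arithmetic.

```python
INPUT_USD_PER_MILLION = 1.50   # quoted input price per million tokens
OUTPUT_USD_PER_MILLION = 9.00  # quoted output price per million tokens


def flash_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated spend in USD for a given token volume at the quoted rates."""
    return (
        input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
        + output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    )


# Hypothetical always-on agent: 2M input and 0.5M output tokens per day.
daily_cost = flash_cost_usd(2_000_000, 500_000)
print(f"${daily_cost:.2f} per day")  # prints $7.50 per day
```

Because output tokens cost six times as much as input tokens at these rates, the smaller output volume still accounts for most of the bill in this example.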

Publishers and “sciencewashing”

Agentic Search arrives on top of an already-bleeding open web. Pew’s tracking of ~69,000 queries shows click-through to traditional links roughly halving when an AI Overview is present, and answerable-category publishers reporting monetizable-traffic losses as high as 89% [17]. Information Agents that synthesize finance, sports, and news 24/7 don’t reverse that curve; they steepen it.

The science track drew the sharpest dissent. The Verge’s Hayden Field and Victoria Song characterized the “solve all diseases” framing around Gemini for Science and AlphaGenome as “sciencewashing”: high-tech jargon papering over the decades-long realities of clinical biology [18].

AlphaGenome and Hypothesis Generation are sharper lenses, not cures.

That is the net read on I/O 2026: real speed gains, bounded by rival models; a genuinely ambitious agent surface, already being stress-tested in the wild; and a distribution play whose costs are landing on publishers and whose rhetoric is outrunning the science.

Round-ups

Anthropic projects first profitable quarter on $10.9B revenue

Source: techcrunch-ai

Anthropic told investors it expects to more than double revenue to roughly $10.9 billion in the second quarter, putting the Claude maker on track for its first profitable period and narrowing the financial gap with rival OpenAI.

Nvidia logs record quarter as Huang eyes $200B agent-CPU market

Source: techcrunch-ai

Nvidia posted another record quarter and disclosed $43 billion in startup holdings, though it guided to slower growth ahead. CEO Jensen Huang pitched investors on a new frontier: CPUs purpose-built for AI agents, which he sizes at $200 billion.

Google Workspace rolls out new AI creation and productivity tools

Source: google-ai-blog

Google updated Workspace with new AI-powered ways to create content and complete tasks across its productivity suite, extending Gemini-driven features deeper into Docs, Sheets, and other apps used by enterprise and Google One subscribers.

Figma ships AI agent that generates and edits designs from prompts

Source: techcrunch-ai

Figma’s new in-canvas AI assistant takes natural-language instructions to create designs, edit existing files, and automate repetitive work like producing iterations, pushing the collaborative design tool further into agentic territory alongside rivals like Adobe and Canva.

SynthID and C2PA face biggest test as AI labeling scales up

Source: the-verge-ai

Invisible provenance tags from Google’s SynthID and the C2PA Content Credentials standard are hitting their largest deployments yet across image, video, and audio. The expansion will show whether the schemes can actually flag deepfakes at internet scale.

OpenAI launches multi-year Singapore partnership

Source: openai-blog

OpenAI for Singapore kicks off a multi-year deal to expand AI deployment across local businesses and public services, paired with workforce training programs aimed at building domestic AI talent in the city-state.

OpenAI expands Education for Countries with new school deployments

Source: openai-blog

The next phase of Education for Countries widens AI adoption in classrooms through fresh government partnerships, teacher training programs, and student-facing tools intended to lift learning outcomes in participating national school systems.

Footnotes

  1. Tom’s Hardware: https://www.tomshardware.com/tech-industry/artificial-intelligence/musks-colossus-1-ai-supercomputers-inefficient-mixed-architecture-design-couldnt-be-used-to-train-grok-so-anthropics-using-it-for-inference-instead-musk-readies-unified-blackwell-only-colossus-2-for-frontier-training-and-potential-ipo

    Colossus 1’s mixed H100/H200/GB200 architecture suffered from a ‘straggler effect’ that limited training to roughly 11% GPU utilization — the entire cluster had to wait for slower chips — so Anthropic is using it for inference while xAI migrates Grok 5 training to the Blackwell-only Colossus 2.

  2. Puck News (William Cohan): https://puck.news/is-elon-giving-up-on-xai/

    Musk previously called Anthropic ‘super smug, sanctimonious, and hypocritical’ and its models ‘misanthropic and evil’ — making the $1.25B/mo lease a striking reversal that some read as Musk effectively giving up on xAI as a frontier-model competitor and pivoting to landlord economics.

  3. Private Credit Pulse (Substack): https://privatecreditpulse.substack.com/p/anthropics-30b-every-eight-months

    Anthropic’s aggregate compute obligations now exceed $130 billion across AWS ($100B/10yr), Google/Broadcom (tens of billions, 1M TPUs), SpaceX ($15B/yr), plus Azure and Fluidstack — roughly 7 GW total, behaving more like a semiconductor foundry commitment than a software cost structure.

  4. Southern Environmental Law Center (press release): https://www.selc.org/press-release/civil-rights-group-sues-xai-for-illegal-pollution-from-data-center-power-plant/

    The NAACP, represented by SELC and Earthjustice, sued xAI in April 2026 alleging it built a de facto power plant of dozens of unpermitted methane turbines next to majority-Black Memphis neighborhoods already enduring some of the highest asthma rates in the U.S.

  5. Tech Brew: https://www.techbrew.com/stories/2026/02/12/AI-employee-exits-safety-ethics

    Anthropic’s head of Safeguards Research, Mrinank Sharma, resigned with a letter warning ‘the world is in peril’ and citing the difficulty of letting organizational values guide corporate actions — one of several safety-team exits coinciding with the xAI compute pact.

  6. Reddit r/InterstellarKinetics analysis: https://www.reddit.com/r/InterstellarKinetics/comments/1tj98sb/analysis_spacexs_ipo_filing_reveals_anthropic_has/

    The S-1 confirms a 90-day termination-for-convenience clause, and Musk has publicly claimed SpaceX ‘reserves the right to reclaim the compute if their AI engages in actions that harm humanity’ — a kill-switch that creates serious rug-pull risk for Anthropic given Musk’s prior hostility.

  7. Perplexity page on 2025 OpenAI Erdős backlash: https://www.perplexity.ai/page/openai-faces-backlash-over-fal-SPqP6XQRQEqFpIXnYv3PVg

    Thomas Bloom and Yann LeCun correctly identified that the model had simply retrieved existing proofs from the literature rather than generating original research; OpenAI subsequently retracted the claim, citing a ‘verification pipeline failure’.

  8. arXiv:2605.20579 (Will Sawin): https://arxiv.org/html/2605.20579v1

    Sawin improved the construction by working with arbitrary ideals rather than just the full ring of integers… He also utilized relative class numbers and modified the Golod-Shafarevich application to quotient by fixed powers of Frobenius elements, ensuring the inertia degree remained small.

  9. StartupFortune, ‘AI as co-discoverer’: https://startupfortune.com/openais-geometry-breakthrough-may-be-its-strongest-case-yet-for-ai-as-co-discoverer/

    On page 39 the model reportedly shifted from analyzing unit-distance graphs via standard incidence geometry to a deep-dive into the arithmetic properties of number fields, noting that the ‘enormous degree’ of algebraic realizations might be a ‘source of possible counterexamples’ rather than a mere nuisance.

  10. Ethan Mollick, One Useful Thing: https://www.oneusefulthing.org/p/something-new-on-openais-strawberry

    Terence Tao characterized [o1] as on par with a ‘mediocre but not completely incompetent graduate student’… it could identify relevant theorems but still made ‘non-trivial mistakes’ and failed to generate original conceptual ideas without heavy human prodding.

  11. r/OpenAI thread on Bubeck claims: https://www.reddit.com/r/OpenAI/comments/1oacp38/openai_researcher_sebastian_bubeck_falsely_claims/

    Community members remain cautious, calling for full disclosure of the sampling setup and compute budget to ensure the result is reproducible as a standard AI research milestone.

  12. VentureBeat, Noam Brown at TED AI: https://venturebeat.com/ai/openai-noam-brown-stuns-ted-ai-conference-20-seconds-of-thinking-worth-100000x-more-data

    Providing a model with even 20 seconds of additional thinking time can yield performance gains equivalent to scaling pre-training data and compute by 100,000x.

  13. Simon Willison, ‘The lethal trifecta for AI agents’: https://simonw.substack.com/p/the-lethal-trifecta-for-ai-agents

    I hope they’ve made this bullet-proof, or this could be a top candidate for the agent security Challenger disaster that we still haven’t seen.

  14. Promptfoo / PromptArmor demonstration: https://www.promptfoo.dev/blog/lethal-trifecta-testing/

    Researchers at PromptArmor showcased a chain of prompt injections in the Antigravity IDE… an indirect injection could manipulate the IDE to invoke a malicious subagent and exfiltrate sensitive information.

  15. Epoch AI Terminal-Bench leaderboard: https://epoch.ai/benchmarks/terminal-bench

    Gemini 3.5 Flash scored 76.2% on Terminal-Bench 2.1 using the Terminus 2 harness, narrowly trailing GPT-5.5 at 78.2%, while OpenAI’s Codex leads Terminal-Bench 2.0 at 77.3%.

  16. r/singularity discussion of Gemini 3.5 Flash: https://www.reddit.com/r/singularity/comments/1tid8xb/whats_your_honest_opinion_about_gemini_35_flash/

    Users have labeled the model ‘benchmaxxed,’ noting that the $1.50/$9.00 pricing per million tokens is a 3x-6x increase over previous Flash iterations.

  17. Pew Research data via hyper.ai: https://hyper.ai/en/stories/458f4f61bfe9fb958490842bf32e7655

    Click-through rates for traditional links fall from 15% to just 8% when an AI Overview is present… some publishers have observed monetizable traffic declines as high as 89%.

  18. The Verge (Field & Song) on Gemini for Science: https://theverge.space/

    Senior reporters Hayden Field and Victoria Song characterized the ‘solve all diseases’ rhetoric as ‘sciencewashing’—the use of high-tech jargon to mask the immense hurdles of clinical biology.

Jack Sun

Jack Sun, writing.

Engineer · Bay Area

Hands-on with agentic AI all day — building frameworks, reading what industry ships, occasionally writing them down.

© 2026 Wei (Jack) Sun · jacksunwei.me Built on Astro · hosted on Cloudflare