OpenAI and rivals back DNA fines, Anthropic files at $965B, ChatGPT gets memory
Three frontier-lab moves today land in separate arenas: a Congress biosecurity ask, an IPO filing at $965B, and ChatGPT's memory launch.
OpenAI and rivals back DNA fines, Anthropic files at $965B, ChatGPT gets memory
TL;DR
- OpenAI, Anthropic, Google, Microsoft, Scale CEOs back $750K DNA-screening fines in a Congress letter.
- Anthropic files confidential IPO at $965B after ARR jumps from $9B to $47B in 5 months.
- OpenAI’s Dreaming V3 ships background memory synthesis on a vendor-stated 82.8% recall figure.
- MIT NANDA finds 95% of enterprise GenAI pilots return zero ROI across $30-40B in spend.
- Trump executive order directs frontier-AI safety testing at agencies DOGE just gutted.
Three frontier-lab moves anchor today’s AI news, and each lands in a different arena. OpenAI, Anthropic, Google DeepMind, Microsoft, and Scale AI CEOs co-signed a June 4 letter pressing Congress to pass S.3741, fining DNA synthesis firms up to $750,000 for unscreened orders. Anthropic filed confidentially for an IPO at $965B after ARR jumped from ~$9B to $47B in five months. And OpenAI shipped Dreaming V3, a background memory synthesis architecture, to ChatGPT Plus and Pro.
The outside view is loud across all three. Biosecurity insiders — one of them a letter signatory — say DNA screening targets the wrong defensive layer and can’t catch AI-evasive sequences. Ed Zitron flags Anthropic booking gross revenue through AWS and Google Cloud, inflating ARR 25-35% versus OpenAI’s net basis. And the Dreaming launch leans on a vendor-stated 82.8% recall, with the prior version landing near 52.9% on independent LOCOMO testing. A Trump executive order on frontier-AI safety testing rounds out the day, directing evaluations at agencies DOGE just gutted of the staff to run them.
OpenAI and rivals back $750K fines for DNA screening lapses
Source: openai-blog · published 2026-06-04
TL;DR
- OpenAI, Anthropic, Google DeepMind, Microsoft, and Scale AI CEOs co-signed a June 4 letter to Congress on bioweapon risk.
- The ask: pass S.3741, fining DNA synthesis firms up to $750,000 for unscreened orders.
- OpenAI’s o3 scored 43.8% on the Virology Capabilities Test vs. 22.1% for PhD virologists — the letter’s load-bearing stat.
- Biosecurity insiders — including a signatory — warn DNA screening can’t catch AI-evasive sequences, targeting the wrong defensive layer.
A coordinated agenda drop, not two stories
On June 4 the normally feuding frontier labs lined up behind a single ask. Sam Altman (OpenAI), Dario Amodei (Anthropic), Demis Hassabis (Google DeepMind), Mustafa Suleyman (Microsoft AI), and Alex Wang (Scale AI) co-signed an open letter to Congress, joined by Nobel laureates and the CEOs of gene-synthesis firms like Twist Bioscience. Their concrete demand: pass the Biosecurity Modernization and Innovation Act (S.3741), a Cotton–Klobuchar bill imposing civil penalties up to $750,000 on synthesis providers that fail to screen orders or verify customers 1. OpenAI’s “Biodefense in the Intelligence Age” plan landed the same morning, repackaging the same threat model into a vendor-side counterpart — Trusted Access tiers, red-teaming protocols, and CAISI cooperation around GPT-Rosalind, the reasoning model OpenAI shipped for life-sciences workflows in April.
Read as one event, the cluster is agenda-setting: a vendor action plan and a CEO letter pointing at the same bill, justified by the same benchmark numbers.
The benchmark doing all the work
The “knowledge barrier is eroding” framing leans almost entirely on SecureBio’s Virology Capabilities Test. OpenAI’s o3 hit 43.8% accuracy, well above the 22.1% PhD-virologist baseline, and beat 94% of experts on questions inside their own sub-specialty 2. GPT-Rosalind extends the picture into end-to-end research: 0.751 on BioXBench and outperforming human experts on 95% of RNA sequence prediction tasks for Dyno Therapeutics 3. OpenAI itself concedes the model hallucinates enough that a secondary “Validation as a System” pipeline has to verify citations and experimental claims before anything ships 3 — a notable caveat given the model is being held up as the reason Congress should act now.
Where the consensus cracks
The sharpest pushback comes from inside the signatory tent. Stanford’s David Relman — himself on the letter — warns that the same models the bill targets can already coach users on how to “alter an order so that even those that are screening may be much less able to detect it” 4. If AI-designed variants routinely evade synthesis-provider screeners, S.3741 mandates the wrong defensive layer: it fines the screener for missing sequences the attacker’s model was explicitly optimized to disguise.
Critics characterize the proposed mandates as “cartel-like gatekeeping.” 5
The second fault-line is structural. Smaller biotechs and academic labs see Trusted Access plus mandatory screening as a “biological moat” that concentrates capability inside a handful of Western incumbents while OpenAI acts as a private regulator deciding who counts as “trusted” 6. Open-source advocates go further: lumping model distillation and open weights in with bioweapon-uplift risk could, under a safety banner, ban legitimate research 5.
What’s actually at stake
The threat model is widely accepted; the remedy is not. Congress is being asked to act on numbers that justify some intervention but don’t obviously justify this one — and the loudest dissent isn’t from AI skeptics but from the biosecurity researchers and small-lab scientists the bill claims to protect.
Further reading
Anthropic files for IPO at $965B amid ARR accounting doubts
Source: techcrunch-ai · published 2026-06-04
TL;DR
- Anthropic filed confidentially for an IPO after ARR jumped from ~$9B to $47B in five months.
- Ed Zitron: Anthropic books gross revenue through AWS and Google Cloud, inflating ARR 25-35% vs OpenAI’s net basis.
- The $1.25B/month xAI compute deal is a 180-day lease with 90-day mutual termination, not a strategy.
- MIT’s Project NANDA: 95% of enterprise GenAI pilots return zero ROI across $30-40B in spend.
The accounting question behind $47B
At Bloomberg Tech, Daniela Amodei spent her stage time defending Anthropic’s growth story and brushing off worries about enterprise AI returns. The headline number — annualized revenue going from $9B at end of 2025 to $47B in May 2026 — is the thing investors will price the IPO on. It’s also the number independent analysts are most skeptical of.
Ed Zitron’s teardown is the most cited: Anthropic books the full amount customers pay through AWS Bedrock and Google Cloud as its own revenue, where OpenAI reports on a net basis. By Zitron’s math, when Anthropic reported $30B ARR earlier in 2026, the OpenAI-comparable figure was closer to $22B — a 25-35% gap that doesn’t disappear at $47B 7. A separate Substack analysis flags a more mechanical distortion: one enterprise customer reportedly burned $500M in a single month with no usage cap, which annualizes into a phantom $6B of ARR on its own 8. Neither finding refutes the growth — it’s clearly real — but both suggest the S-1 will show a narrower gap to OpenAI than the press release implies.
A 180-day lease, not a compute strategy
TechCrunch framed the $1.25B/month xAI arrangement as Amodei’s “capital-light” alternative to OpenAI and xAI’s own data-center buildouts. The SpaceX S-1 tells a different story: the deal is structured as a 180-day lease with a 90-day mutual cancellation notice, and SpaceX has explicitly reserved the right to claw Colossus 1 capacity back for Starship guidance or Grok if its own compute gets tight 9.
That reframes Amodei’s “we’d much prefer slightly more demand than we can serve” line. Anthropic’s marginal compute isn’t a choice — it’s contractually fragile, sitting on a rival’s hardware that can be repossessed in a quarter. It also explains the IPO urgency: public-market capital is what lets Anthropic stop renting Elon Musk’s GPUs and start owning its own.
The ROI data she didn’t engage
Amodei’s response to Uber’s public grumbling about AI productivity was that enterprises are “early” in learning to deploy the tools. The Uber numbers she’s responding to are sharper than that summary admits: the company exhausted its annual AI coding budget in four months, and its COO coined “tokenmaxxing” for developers consuming capacity without shipping more product 10.
MIT’s Project NANDA generalizes the pattern across $30-40B of enterprise GenAI spend: 95% of pilots return zero measurable ROI, and only 5% reach production with P&L impact 11. Goldman’s Jim Covello has made the institutional version of the same argument — almost all economic value so far accrues to NVIDIA, while end customers wait for returns that would justify the trillion-plus in cumulative capex 12.
What the S-1 will have to answer
The pre-IPO debate isn’t whether Anthropic is growing. It’s whether gross-basis ARR, a 90-day-cancelable compute base, and a customer cohort prone to Uber-style budget shocks can hold a near-trillion-dollar valuation through a public-market cycle. When the S-1 uncloaks, every line will be read against those three critiques — and Amodei’s “we’re still early” won’t be a sufficient answer to any of them.
OpenAI’s ChatGPT ‘Dreaming’ memory ships on vendor benchmarks
Source: openai-blog · published 2026-06-04
TL;DR
- OpenAI’s “Dreaming V3” replaces manual saved memories with background synthesis of full chat history.
- Rolled out to ChatGPT Plus and Pro in the US, with Free and Go tiers promised in weeks.
- The headline 82.8% factual recall is vendor-stated — third-party LOCOMO testing put the prior version near 52.9%.
- The “sleep-time” architecture and 5× compute cut mirror Letta’s open-source design shipped in 2025.
What actually shipped
The mechanic is the news. ChatGPT no longer relies on a searchable list of “I told it to remember X” notes — it now references entire chat histories and runs background curation to keep a working profile of preferences, projects, and constraints. OpenAI reports task-success jumps from 41.5% → 82.8% on factual recall, 31.4% → 71.3% on preference adherence, and 9.4% → 75.1% on temporal correctness across its 2024 → 2026 systems. A claimed 5× compute reduction over Dreaming V0 is what unlocks the planned expansion to Free and Go tiers.
Notably absent from the rollout: Enterprise and Education. The most data-sensitive customers are getting background profile synthesis last, inverting OpenAI’s usual sequencing.
The benchmark gap
The 82.8% number is doing a lot of work in the launch coverage, and it should not. Independent analysts flag it as “vendor-stated and directional,” with no public dataset for reproduction because the eval lives on private user data 13. When researchers ran OpenAI’s prior memory layer through the LOCOMO benchmark — a 600-turn long-conversation test — accuracy landed around 52.9%, trailing specialized systems like Mem0 14.
That doesn’t mean V3 isn’t better than V0. It means the headline delta is internal reward-model scoring, not a long-horizon recall test anyone outside OpenAI can replicate. Treat it as a trend signal until somebody re-runs LOCOMO against the new system.
Letta got there first
The “Dreaming” branding obscures a well-trodden research line. Letta — the company built on UC Berkeley’s MemGPT — has been shipping a sleep-time compute architecture that schedules reflection sub-agents to convert “raw context” into “learned context,” and reports the same roughly 5× runtime compute reduction OpenAI now claims 15. A 2026 CMU/UMD paper, Language Models Need Sleep, formalizes the consolidation-plus-dreaming split.
OpenAI’s contribution here is productization and scale across hundreds of millions of users, not architectural novelty. The open-source ecosystem is also more transparent about how the reflection loop works — which matters for the next section.
The security and deletion paradox
Background synthesis enlarges the attack surface in a specific way: every piece of third-party content the model reads is now a potential write to long-term memory. Tenable and independent researcher Johann Rehberger have already demonstrated that indirect prompt injection embedded in web pages can implant false memories that persist across conversations and exfiltrate data on every subsequent reference 16. The new Memory Summary page lets users review memories, but does not obviously block the injection vector at write time.
flowchart LR
A[Third-party web page] -->|indirect prompt injection| B{Dreaming V3<br/>background synthesis}
C[User chat history] --> B
B --> D[(Long-term memory profile)]
D --> E[Every future conversation]
E -. exfiltration on reference .-> F((Attacker endpoint))
G[Memory Summary page] -.review only.-> D
The deletion story is worse than it looks. A May 2025 U.S. court order requires OpenAI to retain chat data even when users delete it — meaning the “forget this” affordance in the new UI is partially theatrical at the storage layer 17. Power users are independently reporting “context rot,” where stale inferences pollute fresh technical sessions and push the model toward sycophancy; the documented workaround is reverting to Temporary Chat or logging out 18.
Users may believe they are managing their privacy through the new summary interface, while the underlying records remain permanently stored. 17
Dreaming V3 is a real engineering scale-up. It is also a contested benchmark, a borrowed architecture, and a privacy UX that doesn’t match its storage reality. All three should travel with the launch.
Round-ups
Trump’s AI safety testing order hits a gutted federal workforce
Source: ars-technica-ai
A new Trump executive order directs US security teams to test frontier AI models for dangerous capabilities, but critics say DOGE-driven cuts have stripped those agencies of the staff and expertise needed to actually run the evaluations.
Federal courts strain under flood of AI-drafted pro se filings
Source: mit-tech-review-ai
Judges including Colorado federal magistrate Maritza Braswell are wading through stacks of ChatGPT-assisted filings from self-represented litigants. The documents often cite fabricated cases, forcing courts to verify citations and rethink how they handle unrepresented plaintiffs.
Satya Nadella joins Latent Space and No Priors at Microsoft Build
Source: latent-space
Microsoft CEO Satya Nadella makes his first Latent Space appearance in a crossover episode with No Priors recorded at Build 2026. The conversation covers Microsoft’s AI roadmap and platform strategy heading into the next year.
Musk’s SpaceX IPO path puts him on track for trillionaire status
Source: the-verge-ai
Decoder host Nilay Patel and NYT reporter Ryan Mac unpack how a looming SpaceX IPO, plus xAI and X stakes, position Musk to become the world’s first trillionaire. Mac coauthored Character Limit, the 2024 book on Musk’s Twitter takeover.
Meta copies Tesla and houses new data centers in tents
Source: techcrunch-ai
Facing ballooning AI infrastructure costs, Meta is borrowing a Tesla manufacturing trick and erecting tent-style structures to house compute capacity. The fabric buildings go up far faster and cheaper than traditional data center shells, trimming time-to-power for GPU clusters.
Reve 2 and Ideogram 4 push layout control in image generation
Source: latent-space
Reve 2 and Ideogram 4 ship updates focused on structured layouts, giving image models tighter control over text placement and composition. Otherwise the day was quiet across the broader AI news cycle.
Latent Space’s AI news digest calls it a quiet day
Source: latent-space
AI News logs another slow news cycle with no major model drops, benchmarks, or funding announcements worth flagging. The daily digest, which typically aggregates Discord and Twitter chatter across the AI engineering community, simply marks the day as uneventful.
Footnotes
-
Crypto Briefing — coverage of the open letter and S.3741 — https://cryptobriefing.com/openai-anthropic-congress-synthetic-dna-regulation/
↩The letter explicitly backs the Biosecurity Modernization and Innovation Act (S.3741), a bipartisan bill from Senators Cotton and Klobuchar that would impose civil penalties of up to $750,000 for failures in sequence screening.
-
arXiv — Virology Capabilities Test (VCT) paper — https://arxiv.org/html/2504.16137v1
↩OpenAI’s o3 model achieved 43.8% accuracy on the VCT, exceeding the average score of PhD-level virologists (22.1%) and outperforming 94% of experts on questions specific to their own sub-specialties.
-
DrugPatentWatch — GPT-Rosalind technical review — https://www.drugpatentwatch.com/blog/gpt-rosalind-what-openais-life-sciences-model-actually-does-to-drug-development/
↩ ↩2GPT-Rosalind achieved a 0.751 score on BioXBench and outperformed human experts in 95% of RNA sequence prediction tasks for Dyno Therapeutics, while still requiring a secondary ‘Validation as a System’ pipeline to verify citations and experimental claims.
-
Singularity Hub — David Relman quoted — https://singularityhub.com/2026/06/04/ai-can-design-and-run-thousands-of-experiments-without-human-hands-we-arent-ready-for-the-risk-to-biosecurity/
↩AI can instruct users on how to ‘alter an order so that even those that are screening may be much less able to detect it,’ effectively teaching malicious actors how to mask regulated sequences as benign fragments.
-
Crypto Briefing — open-source community reaction — https://cryptobriefing.com/openai-anthropic-congress-synthetic-dna-regulation/
↩ ↩2Dissenting voices warn that labeling standard techniques like model distillation as ‘attacks’ could lead to regulations that cripple academic ecosystems and ban open-weight models under the guise of security; critics characterize the proposed mandates as ‘cartel-like gatekeeping.‘
-
Benzinga — industry rivalry framing — https://www.benzinga.com/markets/private-markets/26/06/53018078/ai-rivals-openai-google-deepmind-and-anthropic-unite-on-bioweapon-prevention
↩Critics fear these regulations will create a ‘biological moat’ that prevents smaller labs from accessing the tools needed for vaccine and drug discovery, concentrating power within a narrow group of Western institutions.
-
Ed Zitron, ‘Anthropic’s Profitability Swindle’ (wheresyoured.at) — https://www.wheresyoured.at/anthropics-profitability-swindle/
↩Anthropic records the total amount customers pay through platforms like AWS and Google Cloud as its own revenue… when Anthropic reported a $30 billion ARR in early 2026, OpenAI executives claimed the figure would drop to $22 billion if adjusted to industry-standard net reporting.
-
Sebastian Barros, Substack analysis of Anthropic IPO — https://sebastianbarros.substack.com/p/wow-antrophic-goes-public-at-1-trillion
↩A single enterprise client inadvertently spent $500 million in one month due to a lack of usage limits, which—if annualized—would artificially add $6 billion to the ARR.
-
KraneShares analysis of SpaceX S-1 — https://kraneshares.com/spacex-ipo-5-key-takeaways-from-the-s-1-filing-and-how-to-get-exposure-today-with-agix/
↩Elon Musk recently clarified that the agreement is structured as a 180-day lease with a mutual 90-day cancellation notice… allowing SpaceX the flexibility to reclaim capacity for internal projects such as Starship guidance or xAI’s Grok if compute demand becomes ‘tight’.
-
BigGo Finance recap of Uber AI budget — https://finance.biggo.com/news/0cec016fbb003d80
↩Uber reported exhausting its entire annual AI coding budget in just four months… COO Andrew Macdonald introduced the term ‘tokenmaxxing’ to describe a trend where developers consume vast amounts of AI capacity without a proportional increase in useful product output.
-
MIT Project NANDA ‘GenAI Divide’ report summary (Sundeep Teki) — https://www.sundeepteki.org/blog/the-genai-divide-why-95-of-ai-investments-fail
↩Approximately $30–40 billion has been poured into enterprise Generative AI, [yet] 95% of organizations are realizing zero measurable return on investment… only 5% of custom enterprise solutions successfully reach production with a positive impact on the P&L.
-
Goldman Sachs research summary (Magic Suite, citing Jim Covello) — https://www.magicsuite.ai/blog-articles/goldman-sachs-ai-spending-cloud-cash-flow-risk-2026
↩Almost all economic value remains trapped within semiconductor companies like NVIDIA while customers have yet to see a corresponding rise in returns… the majority of users rely on free versions, leaving a massive gap in the enterprise profitability required to justify the estimated $1 trillion in cumulative capital expenditure.
-
Digital Applied analysis — https://www.digitalapplied.com/blog/chatgpt-memory-dreaming-v3-openai-2026-guide
↩the 82.8% factual recall figure is considered vendor-stated and directional rather than a precision benchmark… these numbers should be treated as a ‘trend signal’ rather than verified technical truth
-
Investing.com / LOCOMO third-party test — https://www.investing.com/news/stock-market-news/dreaming-memory-system-rolls-out-to-chatgpt-plus-and-pro-users-93CH-4727151
↩independent testing on 600-turn conversations showed accuracy closer to 52.9% for earlier memory models, trailing behind specialized systems like Mem0
-
Letta (ex-MemGPT) engineering blog — https://www.letta.com/blog/sleep-time-compute
↩Letta treats memory like an operating system. Its ‘sleep-time compute’ triggers reflection sub-agents on a schedule to reorganize ‘raw context’ into ‘learned context’
-
Reddit r/ArtificialIntelligence + Tenable/Rehberger research — https://www.reddit.com/r/ArtificialInteligence/comments/1q3ip8f/ai_memory_features_are_rolling_out_fast_but/
↩malicious instructions hidden in third-party content can be used to implant ‘false memories’ into the user’s long-term profile without their consent… persist across all future conversations, allowing attackers to exfiltrate private data
-
Neville Hobson, ‘Privacy minefield’ — https://www.nevillehobson.io/is-chatgpts-memory-a-helpful-companion-or-a-privacy-minefield/
↩ ↩2a May 2025 U.S. court ruling requires OpenAI to preserve and segregate all chat data—even if a user attempts to delete it… users may believe they are managing their privacy through the new summary interface, while the underlying records remain permanently stored
-
Reddit r/ChatGPT power users — https://www.reddit.com/r/ChatGPT/comments/1ts7bb0/did_chatgpt_get_another_memory_update_wow_this_is/
↩power users frequently report ‘context rot,’ where outdated or irrelevant memories pollute new, specialized technical queries… the memory feature can stifle creativity by forcing the model into a narrow set of established user assumptions