Anthropic admits 40% token bug, Abridge hits Epic, OpenAI Codex trails Gemini
Every URL the pipeline pulled into ranking for this issue — primary sources plus the supporting and contradicting findings each Researcher returned. Inline citations in the issue point back here.
Sources
Claude Code’s product lead talks usage limits, transparency, and the “lean harness” arstechnica.com
“We have no grand plan,” says Anthropic’s Cat Wu—but that’s by design.
AI-Native Healthcare: 100M Doctor Visits, 10–20 Hours Saved, Prior Auth in Minutes — Janie Lee & Chai Asawa, Abridge latent.space
How Abridge is quietly turning the patient and clinician conversation into the operating system of healthcare
How business operations teams use Codex openai.com
See how business operations teams can use Codex to create initiative briefs, strategy updates, leadership decision packets, progress updates, and more from real work inputs.
How sales teams use Codex openai.com
See how sales teams can use Codex to create pipeline briefs, meeting prep packets, forecast reviews, account plans, and stalled-deal diagnoses from real work inputs.
How data science teams use Codex openai.com
See how data science teams can use Codex to build root-cause briefs, impact readouts, KPI memos, scoped analyses, and dashboard specs from real work inputs.
datasette-llm-limits 0.1a0 simonwillison.net
A new alpha plugin lets Datasette admins enforce dollar-denominated LLM budgets per user or globally. Configuration uses a rolling-24h window with an amount_usd cap, working alongside datasette-llm and datasette-llm-accountant to track and curb runaway model spending inside multi-tenant deployments.
Agents feedback tip bensbites.com
Agent-era software collapses the line between user and developer, the newsletter argues, as users increasingly configure, prompt, and wire up their own workflows. The takeaway: product teams should treat feedback loops and scripting surfaces as first-class features, not power-user afterthoughts.
QR code generator simonwillison.net
Simon Willison shipped a browser-based QR code generator built with Claude’s help, supporting plain text, URLs, and WiFi network handoffs with SSID, password, and WPA security selection. Styling options include square format, optional border, size presets, and color choice.
inaturalist-clumper 0.1 simonwillison.net
The new 0.1 release groups iNaturalist observations into JSON clumps that feed Willison’s blog sightings pipeline, already running in production for several weeks. A companion post showing a Western Gull and Rock Pigeon spotted before PyCon demonstrates the output format in action.
Western Gull, Rock Pigeon simonwillison.net
Western Gull, Rock Pigeon, in Los Angeles Area (custom), CA, US. I went for a bird walk in the morning before PyCon, and we spotted a local seagull enjoying a Starbucks.
References
Anthropic engineering postmortem (April 23, 2026) anthropic.com
A ‘reasoning effort’ bug defaulted models to a lower intelligence mode and a caching regression caused the model to re-process entire conversation histories, leading to ~40% token inflation for identical tasks.
VentureBeat venturebeat.com
Anthropic cracks down on unauthorized Claude usage by third-party harnesses… blocking tools like OpenCode and OpenClaw that spoofed the Claude Code client to access more favorable subscription limits.
CTOL Digital — ‘AI leaders clash on agent architecture’ ctol.digital
Cognition’s Walden Yan published ‘Don’t Build Multi-Agents,’ arguing multi-agent systems create fragile architectures due to poor context sharing… Anthropic countered that parallel role-specialized agents can outperform single-agent baselines by over 80% on complex tasks.
r/ClaudeAI — 52 controlled benchmarks reddit.com
Claude Code’s ‘Agent Teams’ feature—which runs parallel sub-agents—was found to increase costs by over 70% without providing measurable gains in code quality; retry loops can actually degrade output by regenerating entire files and destroying previously correct sections.
InfoWorld infoworld.com
Anthropic puts Claude agents on a meter across its subscriptions… programmatic usage including the Agent SDK transitions into a dedicated monthly credit system, billed at standard API rates rather than drawing from the general chat pool.
The Hacker News thehackernews.com
Claude Code flaws (CVE-2025-59536) allow remote code execution or API key theft via malicious repository-level configuration files… exfiltrating a developer’s Anthropic API keys before showing a trust prompt.
Slashdot / Washington v. Sutter Health coverage slashdot.org
Plaintiffs allege Abridge’s AI-generated clinical notes often include boilerplate language stating the patient was informed of and consented to the recording, even in instances where no such conversation occurred — a ‘consent hallucination’ embedded in the medical record.
npj Digital Medicine study (PMC) pmc.ncbi.nlm.nih.gov
Analysis of nearly 13,000 sentences identified an overall hallucination rate of 1.47%, but 44% of those errors were ‘major’ — capable of affecting diagnosis or patient management, including invented ECGs/mammograms and misattributed family-history symptoms.
HIT Consultant on Epic AI Charting launch hitconsultant.net
Epic launched native AI Charting in February 2026 integrated directly into its ‘Art for Clinicians’ suite, pulling context from the longitudinal record — a ‘watershed moment’ that risks Sherlocking the ambient-AI market.
Business Insider — ‘messiest relationship in healthcare AI’ businessinsider.com
Epic divested its single-digit stake in Abridge in 2025 as it prepared to launch its own internal AI scribe, after Abridge engineers had spent significant time on Epic’s campus co-developing features.
AICerts analysis of Abridge’s Series E aicerts.ai
The $5.3B valuation implies a ~50x forward multiple on ~$100–117M contracted ARR — ‘unhinged’ for a documentation tool when the physician-scribe TAM is estimated at only $1.6–2.1B, forcing a pivot into the $250B RCM market to justify the price.
Reddit r/medicine / r/biotechmarketers practitioner threads reddit.com
Clinicians describe Abridge output as ‘clinical slop’ — the A&P ‘sounds like a medical student,’ the AI occasionally fabricates exam findings or ignores stated negatives, and time saved on drafting is often lost to intensive proofreading.
BankInfoSecurity bankinfosecurity.com
a now-patched vulnerability that allowed attackers to steal GitHub credentials through Codex’s code review feature
Google Cloud Blog — Gemini Enterprise Agent Platform cloud.google.com
Gemini Enterprise Agent Platform… long-running agents capable of autonomous sales prospecting sequences lasting several days
Anthropic financial-services GitHub repo github.com
15 pre-built agentic workflows for sales, operations, and finance that integrate directly with HubSpot, Salesforce, and QuickBooks
Zack Proser — OpenAI Codex Review 2026 zackproser.com
a ‘mediocre SME’ providing thin context will produce mediocre results, as AI models primarily pattern-match rather than reasoning from first principles
Forbes — Victor Dey forbes.com
OpenAI says Codex agents are running its data platform autonomously
Medium — ‘Are Codex and Claude becoming everything platforms for work?’ medium.com
the ‘Copywriter Pattern,’ which allows non-technical staff to ship product changes without engineering intervention… successful users manage 10+ autonomous tasks simultaneously rather than perfecting a single prompt