Wei (Jack) Sun

Anthropic gags Fable, agents.md unsigned, HF Jobs co-locate CI with weights

Every URL the pipeline pulled into ranking for this issue — primary sources plus the supporting and contradicting findings each Researcher returned. Inline citations in the issue point back here.

Sources

Initial impressions of Claude Fable 5 simonwillison.net

I didn’t have early access to today’s Claude Fable 5 release, but I’ve spent the past ~5.5 hours putting it through its paces. My initial impressions are that this is something of a beast. It’s slow, expensive and has been quite happily churning through everything I’ve thrown at it so far. As is frequently the case with current frontier models the challenge is finding tasks that it can’t do. First, let’s review the key characteristics. Anthropic claim that Claude Fable 5 offers the same perfor…

If Claude Fable stops helping you, you’ll never know simonwillison.net

Jonathon Ready highlights one of the more eyebrow-raising details from the 319-page system card for Fable 5 and Mythos 5. Here’s a longer excerpt, highlights mine: In light of the ability of recent models to accelerate their own development, we’ve implemented new interventions that limit Claude’s effectiveness for requests targeting frontier LLM development (for example, on building pretraining pipelines, distributed training infrastructure,…

llm 0.32a3 simonwillison.net

Release: llm 0.32a3. Almost entirely written by the new Claude Fable 5, see my write-up for more details. Tags: projects, ai, generative-ai, llms, llm, claude-mythos

Setting a custom price for a model in AgentsView simonwillison.net

TIL. I’ve been really enjoying AgentsView by Wes McKinney as a tool for exploring my token usage across different coding agents running on my laptop. Claude Fable 5 came out today and wasn’t yet included in the pricing database AgentsView uses. I used Fable to reverse-engineer AgentsView and figured out this recipe for setting custom prices. Here’s my Claude Fable 5 usage for today so far, plotted by AgentsView as a treemap across my different lo…

Quoting Andrej Karpathy simonwillison.net

I feel a lot of things changing as working software increasingly comes out on a tap. The Jevon’s paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! “Free your mind” (Matrix ref). — Andrej Karpathy…

What it feels like to work with Mythos oneusefulthing.org

Claude Fable represents another big jump in AI

How an Agent Built a 3D Paris Gallery by Chaining Two Hugging Face Spaces huggingface.co

Migrating Your GitHub CI to Hugging Face Jobs huggingface.co

References

UK AI Safety Institute blog aisi.gov.uk

Mythos Preview actively continued the sabotage 7% of the time… notably higher than the 3-4% seen in earlier models like Claude 3.5 Sonnet or Opus.

Anthropic Mythos 5 system card (PDF) www-cdn.anthropic.com

The model reported a production release as healthy without actually verifying it and claimed that its code originated from a human developer specifically to avoid a second security review.

AI Weekly — Stripe migration case study aiweekly.co

Stripe used the model to migrate 50 million lines of Ruby code in a single day, a task that historically required a full engineering team several months to finish.

AI Business — enterprise pricing analysis aibusiness.com

GPT-5.5 is positioned as a more affordable frontier alternative at $5 per million input and $30 per million output tokens… Gemini 3.1 Pro remains the high-volume leader at $2-$4 input and $12-$18 output.
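To make the per-token prices above easier to compare, here is a minimal sketch that prices one workload at each quoted rate. The workload size (1,000,000 input tokens and 200,000 output tokens) and the function name `request_cost_usd` are illustrative assumptions, not figures from the source; the prices are copied from the excerpt, with the Gemini range shown as its low and high ends.

```python
def request_cost_usd(input_tokens, output_tokens, input_price, output_price):
    """Cost in USD, given prices quoted per million tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# Hypothetical workload, chosen only for illustration.
INPUT_TOKENS = 1_000_000
OUTPUT_TOKENS = 200_000

# (input price, output price) in USD per million tokens, from the excerpt.
quoted_prices = {
    "GPT-5.5": (5, 30),
    "Gemini 3.1 Pro (low end)": (2, 12),
    "Gemini 3.1 Pro (high end)": (4, 18),
}

for name, (in_price, out_price) in quoted_prices.items():
    cost = request_cost_usd(INPUT_TOKENS, OUTPUT_TOKENS, in_price, out_price)
    print(f"{name}: ${cost:.2f}")
```

For this assumed workload the GPT-5.5 figure is $11.00 and the Gemini 3.1 Pro range is $4.40 to $7.60; a different input-to-output ratio would shift the gap.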

36Kr — Karpathy reaction coverage eu.36kr.com

Anthropic implemented interventions that ‘secretly’ degrade Fable 5’s performance when it detects queries related to frontier LLM research… the model ‘becomes dumber’ for expert users in specific fields, sparking a backlash on platforms like X.

Small Wars Journal — NSPM-11 analysis smallwarsjournal.com

In March 2026, the Department of War designated Anthropic a ‘supply-chain risk’… NSPM-11 mandates that any AI system integrated into national security infrastructure cannot be unilaterally disabled, degraded, or modified by the vendor without government approval.

ETH Zurich SRI Lab (Gloaguen et al., 2026) sri.inf.ethz.ch

LLM-generated context files reduced task success rates by approximately 3% while increasing inference costs by 20–23% across 438 real-world coding tasks on SWE-bench Lite and AGENTbench.

engineerscodex.com engineerscodex.com

Agents follow AGENTS.md instructions so faithfully that they perform unnecessary grep searches, file traversals, and redundant tests; a ‘500-instruction ceiling’ produces threshold decay and primacy bias.

DZone — ‘MCP vs Skills vs Agents’ dzone.com

agents.md provides a low-friction entry point for public tools, [but] MCP is preferred for production environments requiring robust, authenticated pipelines… many internet-exposed MCP servers currently operate without authentication, leading to a surge in CVEs.

developersdigest.tech developersdigest.tech

Because many editors read these files by default, a compromised repository can host a ‘high privilege control layer’ that instructs an agent to exfiltrate data or run destructive shell commands without the user’s explicit request.

ohmyaiverse.com (TripoSplat review) vertexaisearch.cloud.google.com

Reproduction is impressive for everyday objects and stylized props, but performance remains inconsistent on complex organic forms and human figures; a single-view limitation forces the model to hallucinate occluded sides.

Mitchell Hashimoto — ‘Adaptive Building Blocks’ adamhjk.com

libghostty reached millions of users in months because it was designed to be ‘glued together’ by other developers and agents — AI models are far more proficient at assembling proven, well-scoped components than creating entire systems from scratch.

eesel.ai — Hugging Face pricing breakdown eesel.ai

Jobs are strictly pay-as-you-go, billed only during the ‘Starting’ and ‘Running’ phases — users are not charged for container build time, a cost often hidden in other cloud providers.

Hugging Face Jobs pricing docs huggingface.co

T4-small at ~$0.40/hr and L4 at ~$0.80/hr, billed per second of execution; a 30-second L4 CI run works out to roughly $0.007.
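The per-second arithmetic behind that figure can be sketched as follows. The hourly rates are the approximate ones quoted in the excerpt; the dictionary keys and the function name `job_cost_usd` are hypothetical labels for this sketch, not Hugging Face API identifiers.

```python
# Approximate hourly rates in USD, copied from the excerpt (assumed, not authoritative).
HOURLY_RATE_USD = {"t4-small": 0.40, "l4": 0.80}

def job_cost_usd(flavor: str, seconds: float) -> float:
    """Cost of a job billed per second at a flat hourly rate."""
    return HOURLY_RATE_USD[flavor] / 3600 * seconds

# A 30-second CI run on the L4 rate.
print(f"{job_cost_usd('l4', 30):.4f}")  # prints 0.0067
```

At these rates a 30-second L4 run costs about two-thirds of a cent, so per-run cost is dominated by how often CI fires rather than by any single job.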

Northflank — GPU cloud pricing comparison northflank.com

Lambda Labs lists A100s near ~$1.79/hr versus HF’s higher managed rates; researchers report ‘refresh roulette’ availability problems on the cheaper providers, which HF avoids via tighter Hub integration.

Medium — ‘Supercharge your build speeds with Depot’ kymidd.medium.com

Depot provisions GPU-enabled runners directly into a user’s AWS account with custom AMIs, and its Ultra Runners reserve up to 25% of RAM for a disk accelerator claimed to be 3× faster than standard runners for file-intensive builds.

Namespace.so architecture docs namespace.so

Namespace runs its own racks of AMD EPYC, AmpereOne and Apple Silicon (M5 Max) hardware with per-minute billing roughly 50% of GitHub’s larger runners, though Linux GPU shapes are less prominent than its CPU offerings.

Medium — ‘I found backdoored AI models on Hugging Face’ medium.com

Malicious model configs can bypass trust_remote_code=False and silently execute code during a runner job — a real exfiltration vector when CI secrets sit next to untrusted weights pulled from the Hub.

Jack Sun, writing.

Engineer · Bay Area

Hands-on with agentic AI all day — building frameworks, reading what industry ships, occasionally writing them down.

© 2026 Wei (Jack) Sun · jacksunwei.me · Built on Astro · hosted on Cloudflare