Datasette wires fixtures, Ronacher cuts bugs to 4 lines, Claude ports a 1983 PDF

TL;DR

Datasette 1.0a30 ships agent hooks and a fixture helper as one coordinated drop.
Text-to-SQL still tops out at 60–70% on realistic enterprise schemas, making fixtures load-bearing.
Ronacher wants bug reports compressed to 4 lines; curl, meanwhile, saw 8× volume with under 5% valid.
ZeroPath landed ~170 verified curl fixes in a month using AI static analysis instead.
Claude Artifacts reconstructed Usborne’s 1983 Mad House as playable JS from the scanned PDF.

Three unrelated AI stories today, each with its own asterisk. Datasette 1.0a30 lands as agent infrastructure: a jump_items_sql() hook and a fixture-database helper that the new datasette-agent plugin immediately consumes. Independent reviewers, however, flag the lethal trifecta (private data, untrusted content, external communication), sharpened by code-exec plugins, and datasette-fixtures matters precisely because text-to-SQL benchmarks still sit at 60–70% on realistic enterprise schemas.

The other two are about what humans put in and get out. Armin Ronacher wants bug reports cut to four lines (command, expected, actual, verbatim error), against the backdrop of curl’s collapse from 15% valid reports to under 5% as LLM-generated noise floods in, while ZeroPath claims ~170 verified curl fixes in a single month using AI on the other side of the pipe. And Simon Willison rebuilt Usborne’s 1983 Mad House from a scanned PDF in one Claude Artifacts prompt: legally licensed, technically charming, and a useful reminder that the original listings were already a multi-platform abstraction, not a canonical program to be faithful to.

Datasette 1.0a30 lays the plumbing for an LLM SQL agent

Source: simon-willison · published 2026-05-24

TL;DR

Datasette 1.0a30 shipped alongside datasette-agent and datasette-fixtures as a single coordinated drop, not a UI release.
The new jump_items_sql() hook and a fixture-database helper are scaffolding the agent immediately consumes.
Independent reviewers flag the “lethal trifecta” (private data, untrusted content, external communication), with code-exec plugins like datasette-agent-sprites widening the blast radius.
Text-to-SQL benchmarks still sit at 60–70% on realistic enterprise schemas, which is why datasette-fixtures matters more than it looks.

Three releases, one architecture

Read the May 24 trio as one move. Datasette 1.0a30’s headline UI feature is the new “Jump to” menu, but the load-bearing additions are the jump_items_sql() plugin hook and a populate_fixture_database() helper ¹ ², both of which the freshly released datasette-agent and datasette-fixtures plugins consume on day one. Simon Willison’s own Substack write-up is more candid than the release notes about what’s being built: a “pelican sighting” demo answers questions over his blog database, and the Markdown export of the agent’s SQL plus results is pitched explicitly as a reproducibility and audit story ³. The official datasette.io agent post hammers the plugin-first framing: model-agnostic across GPT-5.5 and Gemini 3.5 Flash, with datasette-agent-charts and datasette-agent-sprites already shipping as proof the hook surface is real ⁴.

In other words: 1.0a30 isn’t really a UI release. It’s the permission, hook, and fixture plumbing that lets Datasette host agents at all.

The lethal trifecta lands at Datasette’s door

Independent coverage is noticeably less enthusiastic than Willison’s own posts. PopAI Explorer parks datasette-agent squarely inside the “lethal trifecta” of agent security (access to private data, exposure to untrusted content, and the ability to communicate externally) and singles out datasette-agent-sprites, which executes code in a Fly Sprites sandbox, as carrying a non-trivial blast radius if prompt injection lands ⁵.

flowchart LR
    A[Private SQLite data] --> B{datasette-agent}
    C[Untrusted prompt / web content] --> B
    B --> D[datasette-agent-sprites<br/>Fly sandbox code exec]
    B -. answers, exports .-> E((External output))
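The trifecta is a conjunction of three capabilities (private data, untrusted content, external communication, per the PopAI Explorer coverage), so a deployment audit reduces to a set check. The sketch below is illustrative only: the capability labels, function names, and example deployments are hypothetical and are not part of Datasette or any of its plugins.

```python
# Hypothetical capability labels for the three legs of the trifecta.
TRIFECTA = frozenset(
    {"private_data", "untrusted_content", "external_communication"}
)


def missing_legs(capabilities: set[str]) -> set[str]:
    """Return the trifecta legs that a deployment does NOT have."""
    return set(TRIFECTA - capabilities)


def has_lethal_trifecta(capabilities: set[str]) -> bool:
    """True only when all three legs are present at once."""
    return not missing_legs(capabilities)


# Hypothetical deployments, invented for illustration.
read_only_demo = {"private_data"}
agent_with_exec = {
    "private_data",
    "untrusted_content",
    "external_communication",
    "code_execution",  # extra capability: widens blast radius, not a leg
}

if __name__ == "__main__":
    print(has_lethal_trifecta(read_only_demo))
    print(has_lethal_trifecta(agent_with_exec))
```

The useful output is `missing_legs`: removing any single leg breaks the conjunction, which is why the usual mitigation advice is to drop one capability rather than to harden all three.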

The jump_items_sql hook itself is at least gated on Datasette’s existing execute-sql permission, so menu queries inherit the read-only actor model ². But that model predates LLM callers by years. Sitting next to it in the same release is a quieter security change worth flagging: token-based CSRF protection has been replaced with Sec-Fetch-Site header checks, citing Filippo Valsorda’s research and Go 1.25 ¹. The templates get simpler; the trust boundary shifts to the browser.
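To make the shift concrete, here is a minimal sketch of a Sec-Fetch-Site style check. It illustrates the general technique only and is not Datasette's implementation: the function name, the set of safe methods, and the decision to allow requests that omit the header (on the assumption that they come from non-browser clients) are all choices made for this example.

```python
# Methods that should not change state and are therefore not checked.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}

# Values a browser sends for requests the user or the same origin initiated.
TRUSTED_SITES = {"same-origin", "none"}


def is_request_allowed(method: str, headers: dict[str, str]) -> bool:
    """Return True if a request passes a Sec-Fetch-Site style CSRF check."""
    if method.upper() in SAFE_METHODS:
        return True

    # HTTP header names are case-insensitive, so normalise before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    site = normalized.get("sec-fetch-site")

    if site is None:
        # Assumption for this sketch: clients that omit the header are not
        # browsers (curl, scripts), so CSRF does not apply to them.
        # A stricter policy could fall back to comparing Origin and Host.
        return True

    return site.lower() in TRUSTED_SITES


if __name__ == "__main__":
    print(is_request_allowed("POST", {"Sec-Fetch-Site": "cross-site"}))
    print(is_request_allowed("POST", {"Sec-Fetch-Site": "same-origin"}))
```

No per-form secret is involved, which is what removes the hidden tokens from templates. The cost is that the decision now rests on a header the browser supplies.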

Why fixtures matter more than they look

The agent pitch — point an LLM at SQLite, get answers — collides with current benchmarks. AIMultiple’s 2026 numbers put frontier models at 90–95% on toy schemas but 60–70% on realistic enterprise ones, with “faulty joins” and “hallucinated column names” as the dominant failure modes ⁶.

That recontextualizes datasette-fixtures. A shared, well-characterised schema isn’t just a developer convenience — it’s the only honest way to tell whether an agent plugin works or merely demos well. Shipping it the same day as the agent suggests Willison knows the accuracy story is the weak flank.
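A rough sketch of what fixture-based checking can look like, using only Python's sqlite3 module. It does not use datasette-fixtures or populate_fixture_database(), whose signatures are not given here; the schema, the sample rows, and the function names are invented for illustration. The idea is to run a model's candidate SQL and a hand-written reference query against the same known data and compare the result sets.

```python
import sqlite3


def build_fixture() -> sqlite3.Connection:
    """Create a small in-memory database with known contents (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE posts (
            id INTEGER PRIMARY KEY,
            author_id INTEGER REFERENCES authors(id),
            title TEXT
        );
        INSERT INTO authors VALUES (1, 'Ada'), (2, 'Grace');
        INSERT INTO posts VALUES
            (1, 1, 'Engines'), (2, 1, 'Notes'), (3, 2, 'Compilers');
        """
    )
    # Refuse writes from here on, so candidate SQL cannot alter the fixture.
    conn.execute("PRAGMA query_only = ON")
    return conn


def matches_reference(
    conn: sqlite3.Connection, candidate_sql: str, reference_sql: str
) -> bool:
    """True if candidate_sql returns the same rows as reference_sql, ignoring order."""
    try:
        got = conn.execute(candidate_sql).fetchall()
    except sqlite3.Error:
        # Covers hallucinated columns or tables, syntax errors, attempted writes.
        return False
    expected = conn.execute(reference_sql).fetchall()
    return sorted(got, key=repr) == sorted(expected, key=repr)


REFERENCE = (
    "SELECT a.name, COUNT(*) FROM authors a "
    "JOIN posts p ON p.author_id = a.id GROUP BY a.name"
)
# The two failure modes named in the benchmark write-up.
HALLUCINATED_COLUMN = "SELECT author_name, COUNT(*) FROM posts GROUP BY author_name"
FAULTY_JOIN = (
    "SELECT a.name, COUNT(*) FROM authors a "
    "JOIN posts p ON p.id = a.id GROUP BY a.name"
)

if __name__ == "__main__":
    conn = build_fixture()
    for sql in (REFERENCE, HALLUCINATED_COLUMN, FAULTY_JOIN):
        print(matches_reference(conn, sql, REFERENCE))
```

The faulty join is the instructive case: it executes without error and returns plausible rows, so only a comparison against known data catches it.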

The takeaway

The plumbing is well-designed; the open question is whether any amount of plumbing is sufficient when the caller is a non-deterministic model with tool access.

1.0a30 is Datasette quietly becoming an agent host. The hooks are clean, the permission model is reused rather than reinvented, and the fixture story hints at a real benchmarking posture. What it doesn’t solve — and arguably can’t, at the framework layer — is the trifecta. Anyone wiring datasette-agent to a database with real secrets in it should treat sprites-style plugins as the threat model, not a footnote.

Ronacher: condense bug reports to 4 lines, skip the LLM

Source: simon-willison · published 2026-05-24

TL;DR

Armin Ronacher wants bug reports compressed to 4 lines: command, expected result, actual result, verbatim error.
curl saw 8× the historical submission volume while the valid-report rate collapsed from 15% to under 5%.
Rust’s moderation team now classifies verbose LLM-generated messages as spam, calling reviewers “unpaid QA for hallucinations”.
ZeroPath landed ~170 verified curl fixes in one month using an AI static analyzer — same tooling, opposite outcome.

The four-line bug report

Armin Ronacher’s complaint about “slop issues” filed against Pi reads like a personal vent, but the prescription is sharper than the rant. He wants reporters to stop laundering observations through an LLM and instead submit the irreducible core: (1) the command, (2) the expected outcome, (3) what actually happened, (4) the exact log or error. Everything else — speculated root causes, “fake-minimal repros,” analogies to unrelated code, padded lists of possibly-related error classes — is noise that the maintainer has to disprove before they can start debugging ⁷.

The framing matters. Ronacher isn’t banning AI assistance; he’s banning the asymmetry where a contributor spends thirty seconds prompting and a maintainer spends an hour refuting confident nonsense.

This isn’t one maintainer’s bad week

Ronacher’s post is a node in a documented collapse of the bug-report commons. The canonical data point is curl: per The New Stack, submissions ran roughly 8× the historical average while the valid-report rate fell from 15% to under 5%, ultimately driving Daniel Stenberg to shut down curl’s six-year HackerOne bounty program in January 2026 ⁸. Rust formalized the same intuition into written policy, treating verbose LLM-generated messages as spam on the explicit grounds that they conscript reviewers as “unpaid QA for hallucinations” ⁹. The Linux Foundation has now committed $12.5M to fund triage automation for maintainers buried in AI-generated reports ¹⁰.
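The two curl figures compound, which is easier to see with numbers. The sketch below assumes a hypothetical baseline of 100 reports per period; only the 8× multiplier and the 15% and 5% rates come from the reporting above. Because the reported rate is "under 5%", the flooded figures are a best case.

```python
def triage_load(total_reports: int, valid_percent: int) -> dict[str, float]:
    """Split a report queue into valid and invalid, and compute cost per valid report."""
    valid = total_reports * valid_percent / 100
    invalid = total_reports - valid
    return {
        "valid": valid,
        "invalid": invalid,
        "reports_per_valid": total_reports / valid,
    }


BASELINE_REPORTS = 100  # hypothetical volume, chosen for round numbers

baseline = triage_load(BASELINE_REPORTS, 15)
flooded = triage_load(BASELINE_REPORTS * 8, 5)  # 8x volume at a 5% ceiling

# How many times more invalid reports a maintainer has to dismiss.
noise_multiplier = flooded["invalid"] / baseline["invalid"]

if __name__ == "__main__":
    print(baseline)
    print(flooded)
    print(round(noise_multiplier, 2))
```

Under these assumptions the queue goes from roughly 7 reports read per valid finding to at least 20, and the invalid pile grows by about 9×, not 8×, because the valid share shrank while the volume grew.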

The vocabulary is load-bearing. Simon Willison coined “slop” in May 2024 as a deliberate analog to spam — AI output published without human review, shifting the cost from writer to reader ¹¹. Two years later that framing is institutional.

The counter-evidence Ronacher elides

“generation is cheap but review is prohibitively expensive” ⁷

The picture isn’t uniform. ZeroPath claims it landed roughly 170 valid bug fixes in curl in a single month using an AI-native static analyzer ¹² — in the same project that became the poster child for slop. Stenberg himself distinguishes “slop” from disciplined AI-assisted research like Joshua Rogers’. The dividing line isn’t AI vs. no-AI; it’s whether a human verified the output before pressing submit.

Ronacher’s template implicitly accepts that. A reporter who runs four lines past an LLM to clean up grammar is fine. A reporter who pastes a one-line symptom into a chatbot and submits whatever comes back — root-cause guesses and all — is not. The four-line format makes the difference legible: there’s nowhere for an LLM to insert speculation if the schema only has slots for observed facts.
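One way to picture "slots for observed facts" is as a fixed schema. The sketch below is an illustration, not a format Ronacher published: the class name, field names, validation rule, and the sample report are all assumptions made for this example.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class BugReport:
    """Four observed facts and nothing else (hypothetical schema)."""

    command: str   # what was run
    expected: str  # what should have happened
    actual: str    # what happened instead
    error: str     # verbatim log or error output

    def __post_init__(self) -> None:
        # Every slot must carry content; blank fields are rejected.
        for field in fields(self):
            if not getattr(self, field.name).strip():
                raise ValueError(f"{field.name} must not be empty")

    def render(self) -> str:
        """Format the report as labelled lines, one per field."""
        return "\n".join(
            f"{field.name}: {getattr(self, field.name)}" for field in fields(self)
        )


if __name__ == "__main__":
    # Invented example; the tool name and messages are placeholders.
    report = BugReport(
        command="mytool build --release",
        expected="exit code 0 and a binary in ./dist",
        actual="exit code 1, no binary written",
        error="error: could not open ./dist/mytool for writing",
    )
    print(report.render())
```

Because the dataclass has no field for a root cause or a related-issues list, passing one (for example `root_cause="..."`) raises a `TypeError` at construction. That is the property the paragraph above describes: speculation has nowhere to go.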

What’s actually at stake

The convergence — Rust’s policy, curl’s shutdown, the LF’s triage fund, Ronacher’s template — points at a structural shift in OSS economics. When generation costs collapse and review costs don’t, the bottleneck moves to coordination: who gets to put bytes into a maintainer’s queue, and what shape do those bytes have to take. The four-line bug report is one early answer. It won’t be the last.

Claude rebuilds a 1983 Usborne game from its PDF in one prompt

Source: simon-willison · published 2026-05-24

TL;DR

Simon Willison turned Usborne’s 1983 Mad House PDF into a playable vanilla-JS port via one Claude Artifacts prompt.
Usborne explicitly licenses these adaptations — readers may port the programs to modern languages and republish, with credit.
Hand-ported precedents already exist in Node and Python — the novelty is method, not artifact.
Fidelity is softer than it looks: original listings were a multi-platform abstraction, not a single canonical program.

A weekend hack on a well-trodden lane

Willison pointed Claude at the scanned PDF of Usborne’s 1983 Creepy Computer Games, asked for “a vanilla JS artifact that exactly recreates the game Mad House… mobile friendly and… retro aesthetic,” and shipped a playable green-on-black version the same afternoon. No manual transcription, no BASIC-to-JS rewrite — just the book as input and a working browser game as output.

The interesting thing is that this isn’t a legal grey zone or a lone experiment. Usborne’s own terms explicitly invite readers to “adapt any of the programs in these books to modern computer languages” and republish them with credit ¹³ — which is precisely the link-and-attribute pattern Willison used. And a small ecosystem of hand-ported preservation projects already exists: filscentia/spacegames reinterprets Computer Spacegames into Node.js and Python for the command line ¹⁴, with similar repos covering BBC Micro and Spectrum variants. What changed isn’t the artifact. It’s that an LLM compressed weeks of hobbyist retyping into one prompt cycle.

What “exactly recreates” actually means

The Mad House listing was never a single-target program. Usborne shipped a ZX81 base listing with a symbol system — triangles for VIC-20, squares for BBC Micro, other glyphs for Apple II, TRS-80 and Oric — printed next to the lines you’d swap for your machine ¹⁵. Claude was reading a deliberately abstracted multi-platform document, so “exactly” is doing more work than it should. Likewise, the original books’ charm came from colourful illustrations and personified robots that explained loops and memory in a register modern markdown can’t reach ¹⁶; a terminal palette captures the era’s vibe, not its pedagogy.

None of this makes the port worse — it just locates the win. Claude is doing semantic translation of game logic from prose-plus-code, not literal line-by-line conversion, and the result runs without an emulator on a phone. That’s the actual capability claim.
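The "base listing plus per-machine swaps" structure can be modelled as a base mapping with platform overrides. The sketch below is a toy: the line numbers and BASIC statements are illustrative placeholders and are not taken from the Mad House listing, and the platform keys and function name are invented for this example.

```python
# Base listing keyed by BASIC line number (placeholder lines, not from the book).
BASE_LISTING: dict[int, str] = {
    10: "CLS",
    20: 'PRINT "MAD HOUSE"',
    30: "GOTO 20",
}

# Per-platform replacement lines, standing in for the printed symbols
# that marked which lines to swap on each machine.
VARIANTS: dict[str, dict[int, str]] = {
    "zx81": {},  # the base listing is used unchanged
    "vic20": {10: "PRINT CHR$(147)"},
    "apple2": {10: "HOME"},
}


def resolve_listing(platform: str) -> list[str]:
    """Return the numbered program lines to type in for one platform."""
    overrides = VARIANTS[platform]  # raises KeyError for unknown platforms
    merged = {**BASE_LISTING, **overrides}
    return [f"{number} {merged[number]}" for number in sorted(merged)]


if __name__ == "__main__":
    for name in VARIANTS:
        print(name, resolve_listing(name))
```

Seen this way, "exactly recreates" has no single referent: each platform key resolves to a different program, and a JavaScript port is in effect one more entry in the variants table.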

The Artifacts pattern, with rough edges

Mad House is one node in Willison’s broader “hoard” of 200+ single-file HTML tools built via Claude Artifacts, including a documented “Claudeception” trick where artifacts call window.claude.complete() to spawn further LLM calls from inside the generated app ¹⁷. The pattern has real reach. It also has documented friction: users report limited output context truncating code mid-game and versioning glitches where the Artifacts UI silently deletes large sections during iteration ¹⁸. Mad House’s tight game loop fits comfortably inside the budget. Something like Mystery of Silver Mountain’s adventure listings would hit those ceilings hard.

Why it matters

The headline isn’t “AI does cute thing with old book.” It’s that a publisher-sanctioned preservation lane that previously required a hobbyist willing to retype 400 lines of BASIC now takes one prompt and a PDF. That changes the unit economics of which 1980s software gets revived — and which gets quietly forgotten because nobody had the afternoon to spare.

Footnotes

¹ Datasette docs, 1.0a30 changelog: https://docs.datasette.io/en/latest/changelog.html

Replaced traditional token-based CSRF protection with the modern Sec-Fetch-Site header approach… inspired by Go 1.25 and research by Filippo Valsorda… eliminates the need for hidden tokens in HTML templates. Also introduced RenameTableEvent so plugins like datasette-comments stay synchronized when tables are renamed.

² Datasette latest docs, jump_items_sql hook: https://docs.datasette.io/en/latest/

Implementations typically verify if the current ‘actor’ has the execute-sql permission before running the query to populate the menu… designed to respect the underlying SQLite read-only constraints, ensuring that menu-generation queries cannot be used for unauthorized data modification.

³ Simon Willison Substack, ‘Datasette Agent, an AI assistant’: https://simonw.substack.com/p/datasette-agent-an-ai-assistant-for

The agent successfully queried a blog database to find his most recent ‘pelican sighting’… the ‘export Markdown’ feature was particularly highlighted as a way to maintain a reproducible audit trail of the agent’s SQL logic and results.

⁴ datasette.io blog, Datasette Agent announcement: https://datasette.io/blog/2026/datasette-agent/

Datasette Agent supports a wide array of backends, including OpenAI’s GPT-5.5 and Google’s Gemini 3.5 Flash… ‘plugin-first’ architecture… datasette-agent-charts and datasette-agent-sprites plugins as evidence that the system is a viable platform for ‘vibe coding’.

⁵ PopAI Explorer, Datasette Agent coverage: https://www.popaiexplorer.com/en/news/2026-05-21-datasette-agent-13861f74

Because datasette-agent can execute SQL and potentially run code via plugins like datasette-agent-sprites (which uses a Fly Sprites sandbox), it resides in a high-risk category for prompt injection… the ‘lethal trifecta’ of agent security: access to private data, exposure to untrusted content, and the ability to communicate externally.

⁶ AIMultiple, Text-to-SQL benchmarks 2026: https://aimultiple.com/text-to-sql

Top-tier models (Claude Sonnet 4.5 and GPT-5) achieve 90-95% accuracy on simple schemas… a significant drop to 60-70% on complex enterprise datasets. Common failure modes include ‘faulty joins’ and ‘hallucinated column names’.

⁷ Armin Ronacher, ‘Pi OSS’ (lucumr.pocoo.org): https://lucumr.pocoo.org/2026/5/24/pi-oss/

the issue tracker is increasingly serving as a prompt input for AI agents… generation is cheap but review is prohibitively expensive

⁸ The New Stack, curl vs HackerOne: https://thenewstack.io/curl-fights-a-flood-of-ai-generated-bug-reports-from-hackerone/

submissions had spiked to eight times the historical average, while the valid-report rate plummeted from 15% to less than 5%

⁹ Rust moderation team spam policy (GitHub): https://github.com/rust-lang/moderation-team/blob/main/policies/spam.md

verbose, LLM-generated messages [are treated] as a form of spam because they force reviewers to act as ‘unpaid QA’ for hallucinations

¹⁰ OpenSourceForU, Linux Foundation AI defense initiative: https://www.opensourceforu.com/2026/03/linux-foundation-launches-open-source-defence-against-ai-bug-surge/

Linux Foundation launched a $12.5 million initiative to fund ‘AI defense’ tools designed to help maintainers automate the triage of incoming reports

¹¹ Simon Willison, ‘Slop is the new name for unwanted AI-generated content’ (May 2024): https://simonwillison.net/2024/May/8/slop/

slop is AI-generated content that is shared without human oversight or regard for the recipient’s time; ‘don’t publish slop’ is a baseline for AI ethics

¹² ZeroPath blog, 170 valid bugs in curl: https://zeropath.com/blog/how-zeropath-won-over-curl-with-170-valid-bugs

curl received nearly 170 verified bug fixes from AI tools in a single month

¹³ Usborne Publishing, Computer and Coding Books: https://usborne.com/us/books/computer-and-coding-books

Usborne Publishing has released the majority of its 1980s computing catalogue as free PDFs and explicitly authorises readers to ‘adapt any of the programs in these books to modern computer languages’ and share those adaptations freely online.

¹⁴ filscentia/spacegames (GitHub): https://github.com/filscentia/spacegames

Reinterprets the Usborne Computer Spacegames book into Node.js and Python, designed to run from the command line; part of a small ecosystem of manual, hand-ported preservation projects that predate LLM-assisted recreations.

¹⁵ Retro Computing Forum, Usborne 1980s computer books thread: https://retrocomputingforum.com/t/usborne-1980s-computer-books/275

The original listings were architected for portability via a symbol-based system (triangles for VIC-20, squares for BBC Micro) printed next to lines requiring modification, so a single ZX81-base listing could be retyped on Spectrum, Apple II, TRS-80 or Oric.

¹⁶ BoingBoing, ‘Usborne releases free PDFs of its 1980s computer books’: https://boingboing.net/2016/02/07/usborne-releases-free-pdfs-of.html

The books’ heavy use of colourful illustrations and personified robots to explain memory and loops kept readers engaged, a sharp contrast to the ‘stranglehold of markdown’ in modern technical guides.

¹⁷ Simon Willison, ‘Everything I built with Claude Artifacts’ (Substack): https://simonw.substack.com/p/everything-i-built-with-claude-artifacts

Documents over 200 single-file HTML tools built via the Artifacts pattern, including a ‘Claudeception’ technique where artifacts call window.claude.complete() to orchestrate further LLM requests from inside the generated app.

¹⁸ BiggGo, ‘Claude Artifacts User Issues’: https://biggo.com/news/202507160714_Claude_Artifacts_User_Issues

Users report ‘limited output context’ truncating code mid-game and ‘versioning glitches’ where the Artifacts UI fails to update or silently deletes large sections during iteration, a known failure mode for longer retro recreations.

Jack Sun, writing.
