Willison ships datasette-agent —unsafe, gates Cloudflare on one ampersand

TL;DR

datasette-agent 0.3a0 lets the LLM run INSERT, UPDATE, and CREATE TABLE from natural-language prompts.
--unsafe flag suppresses the approval UI while leaving Datasette’s row-level ACLs enforced.
65% of organizations hit an AI-agent security incident in the past year, per cited industry data.
One-line Cloudflare WAF rule challenges only /search/* queries containing an &, passing single ?q= lookups instantly.
Any scraper that drops the & walks past the heuristic, which Willison labels honest UX rather than defense.

Today’s AI-tech reads both come from Simon Willison, and both ship a guardrail the author tells you up front isn’t really one. datasette-agent 0.3a0 hands the LLM execute_write_sql and adds an --unsafe flag that suppresses the approval UI — Willison is explicit that this puts the project on the wrong side of the lethal trifecta by default-capability. The Cloudflare post is the same move at a smaller scale: a one-line WAF rule that lets ?q=lemur through unchallenged and gates everything else, which any scraper bypasses by dropping the &.

The pattern is worth naming because the alternative — silent defense-in-depth that the author privately knows is leaky — is what most launch posts actually ship. Calling the flag --unsafe and calling the WAF rule honest UX, not defense makes the trade-off legible to whoever inherits the config.

datasette-agent 0.3 lets the LLM write to your database

Source: simon-willison · published 2026-06-15

TL;DR

datasette-agent 0.3a0 adds execute_write_sql, letting the agent INSERT, UPDATE, and create tables from natural-language prompts.
Datasette’s insert-row/update-row ACLs stay enforced even under --unsafe — the flag only suppresses the approval UI.
The new --unsafe flag puts the project on the wrong side of the “lethal trifecta” by default-capability, if not default-configuration.
65% of organizations hit an AI-agent security incident in the past year, 61% of those leaking sensitive data.

One new tool, one real threshold crossed

On paper, datasette-agent 0.3a0 is a point release: one new tool (execute_write_sql), three new CLI flags (--root, --yes, --unsafe), and a terminal that now renders approval prompts. In practice it is the moment Simon Willison’s own agent project enters the threat model he has been documenting for over a year.

Willison coined “lethal trifecta” for agents that combine private-data access, untrusted-content exposure, and the ability to communicate externally or change state ¹. Until 0.2a0, datasette-agent had two of three — it could read your data and ingest arbitrary user prompts, but it couldn’t mutate anything. execute_write_sql closes the triangle.

flowchart LR
    A[Private DB rows] --> B{datasette-agent + LLM}
    C[Untrusted prompts<br/>or row content] --> B
    B --> D[execute_write_sql]
    D -. mutates .-> A
    D -. exfiltrates via writes .-> E((External state))

HiddenLayer puts the shift bluntly: once an LLM is wired to a tool that mutates a database, “the LLM effectively becomes a code execution primitive rather than just a text generator” ².

What the safety model actually does

The implementation is better than the release notes advertise. Writes route through Datasette’s existing /-/execute-write endpoint as the requesting actor, and granular insert-row, update-row, and delete-row permissions are checked per statement. Critically, those ACL checks remain active even under --unsafe — the flag only suppresses the human-approval UI, not the underlying authorization layer ³. That’s a meaningfully better architecture than the common “give the agent a DB connection string and pray” pattern.

The approval dialog itself is well-designed: it shows the parameterized SQL, the bound values, the target table, and the exact permissions the operation will exercise before the user clicks Yes.

Where it doesn’t hold

The structural problem isn’t datasette-agent’s; it’s every write-capable agent’s. Keysight’s analysis is uncomfortable reading: LLMs have no parameterized-query equivalent, so instructions and data share a single channel. Any agent with write access is functionally “a God User” reachable through indirect prompt injection in row content, search results, or imported CSVs ⁴.

A confirmation dialog mitigates this until approval fatigue sets in — which is exactly what --yes and --unsafe are built to enable. The flag taxonomy mirrors the rest of the market: Claude Code has --dangerously-skip-permissions, Cursor has “Yolo Mode.” The 2026 enterprise pattern has already moved past blanket bypass toward dontAsk allowlists that restrict agents to a narrow set of pre-approved operations ⁵. Willison’s --unsafe naming is at least more honest about the trade than Anthropic’s euphemism, but the underlying knob is the same.

The Kiteworks numbers suggest the risk is no longer theoretical: 65% of surveyed organizations hit an AI-agent security incident in the prior year, and 61% of those incidents leaked sensitive data ⁶.

The takeaway

datasette-agent 0.3a0 is the rare release where the diff is small and the posture shift is large. The permission grounding is genuinely better than peer tools, and the per-statement approval UI is the right default. But “right default” and “what power users will actually run” are different questions — and the moment datasette agent chat content.db --unsafe becomes the muscle-memory invocation, the trifecta is fully armed.

Willison gates Cloudflare CAPTCHA on a single ampersand

Source: simon-willison · published 2026-06-16

TL;DR

One-line WAF rule restricts Cloudflare’s Managed Challenge to /search/* URLs whose query string contains an &, so ?q=lemur loads instantly.
Single-parameter ?q= queries get a pass; multi-facet combinations look like crawlers spidering the URL space.
Cloudflare’s MCP server couldn’t edit WAF rules, so Claude Code fell back to the raw Cloudflare API.
Honest UX, not defense — any scraper that drops the & walks past the heuristic unchallenged.

The rule

Simon Willison’s faceted search engine was protected by a blanket Cloudflare Managed Challenge on /search/*, which kept crawlers out but also forced humans to clear a CAPTCHA for a one-word query. His fix is a single WAF expression:

(http.request.uri.path wildcard r"/search/*"
 and http.request.uri.query contains "&")

Anything with one query parameter passes through. Anything combining facets — ?q=term&category=tech&year=2024, the shape a crawler generates when it enumerates the product of every filter — gets challenged. The combinatorial blowup of faceted URLs is the actual resource problem; the ampersand is a cheap proxy for “this request is exploring the cross-product.”

Worth noting for anyone tempted to extend the pattern: Cloudflare’s newer wildcard operator evaluates the entire field, requires raw-string syntax to avoid escaping pain, and rejects two consecutive unescaped asterisks as invalid ⁷. contains "&" is the right primitive here; wildcard r"*&*" is not a drop-in upgrade.

Claude Code as infra glue, with a caveat

The interesting subplot is the tooling. Willison reached for the Cloudflare MCP server from Claude Code first, found it couldn’t modify WAF custom rules, and had Claude Code switch to direct Cloudflare API calls instead. That fallback worked — the rule shipped — but it fits a documented friction pattern between Claude Code and Cloudflare. The reverse case also bites users: Cloudflare’s own managed WAF rules routinely flag MCP tool arguments containing curl invocations or raw HTTP as malicious, surfacing as opaque Error POSTing to endpoint failures, with base64-encoding the payload as the community bypass ⁸.

The takeaway isn’t that MCP is broken; it’s that the MCP surface for any given vendor is a strict subset of the API, and “have the agent fall back to curl” remains the load-bearing escape hatch.

A UX patch, not a defense

The rule does what Willison claims: it removes CAPTCHA friction for the 99% case while keeping the expensive facet-combinatorics path challenged. As security, though, it’s thin. Malicious bots are roughly 37% of web traffic, routinely ignore robots.txt, and parameter-count or character-presence heuristics are increasingly treated as outdated signature-based defenses that AI-driven crawlers trivially emulate ⁹. A motivated scraper who reads this post simply iterates ?q= values one at a time and never sends an &.

The durable version of this setup pairs the ampersand carve-out with rate limiting on /search/* and verified-bot skip rules via cf.client.bot, leaving the character heuristic as a coarse first filter rather than the security boundary. Read as ergonomics — “stop CAPTCHA-ing my readers” — the rule is a clean win. Read as bot mitigation, it’s the easy half.

Simon Willison — lethal-trifecta tag index — https://simonwillison.net/tags/lethal-trifecta/

An AI agent with access to private data, exposure to untrusted content, and the ability to communicate externally or change state creates a catastrophic risk profile.

↩
HiddenLayer research — ‘The Lethal Trifecta and how to defend against it’ — https://www.hiddenlayer.com/research/the-lethal-trifecta-and-how-to-defend-against-it

When wired to tools that can run queries or write to filesystems, the LLM effectively becomes a code execution primitive rather than just a text generator.

↩
datasette.io — SQL write queries blog post — https://datasette.io/blog/2026/sql-write-queries/

On Datasette 1.0a20 and later, INSERT triggers a check for the insert-row permission on the target table, while UPDATE or DELETE require update-row or delete-row respectively — even in —unsafe mode the underlying actor checks remain active.

↩
Keysight — DB-query-based prompt injection — https://www.keysight.com/blogs/en/tech/nwvs/2025/07/31/db-query-based-prompt-injection

LLMs lack a parameterized-query equivalent; they cannot strictly separate instructions from data, so an agent granted write access effectively becomes a ‘God User’.

↩
nxcode.io — Claude Code vs Cursor 2026 — https://www.nxcode.io/resources/news/claude-code-vs-cursor-which-is-better-2026

Claude Code implements YOLO mode through the —dangerously-skip-permissions flag… enterprise adoption in 2026 often mandates ‘dontAsk’ modes for CI pipelines, which restrict agents to a narrow allowlist.

↩
Kiteworks — AI agent security incidents 2026 — https://www.kiteworks.com/cybersecurity-risk-management/ai-agent-security-incidents-2026/

65% of organizations surveyed experienced a cybersecurity incident involving AI agents in the preceding year, with 61% of those involving sensitive data exposure.

↩
Cloudflare blog — Wildcard rules — https://blog.cloudflare.com/wildcard-rules/

The wildcard operator evaluates the entire field value; raw string syntax (r”/wp-*.php”) avoids double-escaping, and two consecutive unescaped asterisks are invalid.

↩
GitHub issue — anthropics/claude-code (Cloudflare WAF blocking MCP calls) — https://github.com/anthropics/claude-code/issues/64384

Cloudflare’s managed WAF rules flag MCP tool arguments containing curl invocations or raw HTTP as malicious, producing opaque ‘Error POSTing to endpoint’ failures; base64-encoding the payload is a trivial bypass that inconveniences legitimate users without providing robust security.

↩
Jasmine Directory — robots.txt vs AI blockers — https://www.jasminedirectory.com/blog/robots-txt-vs-ai-blockers-managing-who-scrapes-your-content/

Malicious bots account for ~37% of web traffic and routinely ignore robots.txt; a hybrid of robots.txt for SEO hygiene plus a WAF for volumetric protection is the practitioner consensus, with parameter-count heuristics increasingly viewed as outdated signature-based defenses easily bypassed by AI-driven crawlers.

↩

Willison ships datasette-agent --unsafe, gates Cloudflare on one ampersand

Willison ships datasette-agent —unsafe, gates Cloudflare on one ampersand

TL;DR

datasette-agent 0.3 lets the LLM write to your database

TL;DR

One new tool, one real threshold crossed

What the safety model actually does

Where it doesn’t hold

The takeaway

Willison gates Cloudflare CAPTCHA on a single ampersand

TL;DR

The rule

Claude Code as infra glue, with a caveat

A UX patch, not a defense

Jack Sun, writing.

Willison ships datasette-agent —unsafe, gates Cloudflare on one ampersand

TL;DR

datasette-agent 0.3 lets the LLM write to your database

TL;DR

One new tool, one real threshold crossed

What the safety model actually does

Where it doesn’t hold

The takeaway

Willison gates Cloudflare CAPTCHA on a single ampersand

TL;DR

The rule

Claude Code as infra glue, with a caveat

A UX patch, not a defense

Footnotes

Jack Sun, writing.