llm 0.32a2 adopts Responses API, CSP tool gates fetches, Codex repros segfault
Every URL the pipeline pulled into ranking for this issue — primary sources plus the supporting and contradicting findings each Researcher returned. Inline citations in the issue point back here.
Sources
llm 0.32a2 simonwillison.net
Release: llm 0.32a2 A bunch of useful stuff in this LLM alpha, but the most important detail is this one: Most reasoning-capable OpenAI models now use the /v1/responses endpoint instead of /v1/chat/completions . This enables interleaved reasoning across tool calls for GPT-5 class models. #1435 This means you can now see the summarized reasoning tokens when you run prompts against an OpenAI model, displayed in a different color to standard error. Use the -R or --hide-reasoning flags if you don't…
CSP Allow-list Experiment simonwillison.net
Tool: CSP Allow-list Experiment An experiment that shows that you can load an app in a CSP-protected sandboxed iframe (see previous note ) and have a custom fetch() that intercepts CSP errors and passes them up to the parent window… which can then prompt the user to add that domain to an allow-list and then refresh the page. I built this one with GPT-5.5 xhigh running in the Codex desktop app. Tags: content-security-policy , iframes , security
datasette 1.0a29 simonwillison.net
Release: datasette 1.0a29 New TokenRestrictions.abbreviated(datasette) utility method for creating “_r” dictionaries. #2695 Table headers and column options are now visible even if a table contains zero rows. #2701 Fixed bug with display of column actions dialog on Mobile Safari. #2708 Fixed bug where tests could crash with a segfault due to a race condition involving Datasette.close() . #2709 That segfault bug was gnarly . I added a mechanism to Datasette recently that woul…
References
newreleases.io (0.32a2 changelog) newreleases.io
Use the new model option -o chat_completions 1 to force the CLI to fall back to the older /v1/chat/completions code path, restoring compatibility with Ollama, vLLM, and Groq.
Sean Goedecke — ‘The Responses API’ seangoedecke.com
The Responses API is designed specifically to hide proprietary ‘thinking styles’ and implementation details from other labs while still providing a stateful experience.
The New Stack — Open Responses vs Chat Completion thenewstack.io
OpenAI claims 3–5% improvement on benchmarks such as SWE-bench and TAUBench when the same models are used via /v1/responses instead of chat/completions, attributed to ‘preserved reasoning’ across turns.
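The claimed gain rests on reasoning state being carried across turns rather than discarded. As a rough sketch of how the two request shapes differ (field names follow OpenAI's published API docs, but the model id and response id here are placeholders, and no network call is made):

```python
# Illustrative request bodies only; nothing is sent anywhere.
chat_completions_request = {
    "model": "gpt-5",  # placeholder model id
    "messages": [
        {"role": "system", "content": "You are a terse assistant."},
        {"role": "user", "content": "Summarize the release notes."},
    ],
}

responses_request = {
    "model": "gpt-5",
    "input": [
        {"role": "user", "content": "Summarize the release notes."},
    ],
    # Server-side state: chain onto a prior turn instead of resending
    # the whole transcript, so reasoning items can persist across turns.
    "previous_response_id": "resp_abc123",  # placeholder id
}

# Chat Completions is stateless (full history resent every call);
# Responses threads state through previous_response_id.
assert "previous_response_id" not in chat_completions_request
assert "messages" not in responses_request
```

The Reddit thread below suggests this statefulness is also where the rough edges are: context that callers assumed would ride along with previous_response_id sometimes did not.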
r/OpenAIDev thread on Responses API reddit.com
Early adopters reported the apparent loss of system-instruction context when using previous_response_id, and inconsistent tool-calling behavior that was not present in the original Chat Completions interface.
OpenAI Cookbook — Reasoning Items developers.openai.com
By opting into include=["reasoning.encrypted_content"], developers receive a proprietary encrypted blob that can be passed back so models like o3 do not have to ‘think from scratch’ under Zero Data Retention.
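Under Zero Data Retention nothing is stored server-side, so the encrypted reasoning item has to ride along in the client's own conversation state. A hedged sketch of that round trip, with payload shapes assumed from the Cookbook's description and a fabricated placeholder blob:

```python
# Sketch of the ZDR round trip described above. Shapes are assumptions
# based on the Cookbook's prose; no network calls are made.
request = {
    "model": "o3",
    "input": [{"role": "user", "content": "Plan the migration."}],
    "store": False,                              # ZDR: nothing persisted
    "include": ["reasoning.encrypted_content"],  # ask for the opaque blob
}

# Pretend response: an encrypted reasoning item the client cannot
# read but can replay verbatim.
response_output = [
    {"type": "reasoning", "encrypted_content": "gAAAAB...opaque..."},
    {"type": "message", "role": "assistant",
     "content": "Step 1: snapshot the database."},
]

# Next turn: feed the encrypted item back so the model resumes its
# earlier chain of thought instead of re-deriving it from scratch.
follow_up = {
    "model": "o3",
    "store": False,
    "input": response_output + [
        {"role": "user", "content": "Now handle the rollback path."},
    ],
}

assert follow_up["input"][0]["type"] == "reasoning"
```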
Simon Willison’s substack — ‘LLM 0.27 with GPT-5’ simonw.substack.com
The tool now logs detailed token usage, including ‘invisible’ reasoning tokens, to its internal SQLite database — useful given pelican-SVG generations sometimes cost over a dollar a piece.
Simon Willison — earlier CSP iframe escape test (Apr 2026) simonwillison.net
Once a CSP is parsed from a meta tag, it is immutable; the script cannot remove, modify, or overwrite it, even if the iframe is navigated to a data: URI.
Vercel Sandbox Firewall docs vercel.com
v0 utilizes a Sandbox Firewall with three modes: allow-all, deny-all, and user-defined … ‘credentials brokering’ injects secrets into egress traffic after it leaves the sandbox, ensuring the untrusted AI code never actually ‘sees’ the API keys it is using.
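The brokering idea is that sandboxed code only ever handles a placeholder, and a trusted egress proxy swaps in the real secret after the request leaves the sandbox. A minimal, hypothetical sketch of that pattern (not Vercel's implementation; all names and the placeholder syntax are invented here):

```python
from urllib.parse import urlparse

# Real secrets live only in the broker process, never in the sandbox.
SECRETS = {"OPENAI_KEY": "sk-real-secret"}

def sandbox_build_request() -> dict:
    # Untrusted code sees only the placeholder, never the key itself.
    return {
        "url": "https://api.example.com/v1/chat",
        "headers": {"Authorization": "Bearer {{OPENAI_KEY}}"},
    }

def broker_egress(request: dict, allowlist: set) -> dict:
    # Deny-by-default firewall: only allowlisted hosts get out.
    host = urlparse(request["url"]).hostname
    if host not in allowlist:
        raise PermissionError(f"egress to {host} blocked")
    # Inject the real credential only as the request leaves the sandbox.
    headers = {
        k: v.replace("{{OPENAI_KEY}}", SECRETS["OPENAI_KEY"])
        for k, v in request["headers"].items()
    }
    return {**request, "headers": headers}

req = sandbox_build_request()
out = broker_egress(req, allowlist={"api.example.com"})
assert out["headers"]["Authorization"] == "Bearer sk-real-secret"
assert "sk-real-secret" not in req["headers"]["Authorization"]
```

The design choice worth noting: the substitution happens on the broker's side of the trust boundary, so even fully compromised sandbox code can at worst ask the broker to spend a credential, never read it.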
Anthropic — Claude Artifacts support docs support.claude.com
Anthropic uses a Model Context Protocol (MCP) to broker access to external services … users must explicitly approve individual service connections, keeping the ‘sandbox’ closed except for authorized ‘doors’.
web.dev — Sandboxed iframes web.dev
If a sandbox is configured with allow-scripts but without allow-same-origin, the ‘null’ origin remains a persistent hurdle for any automated report-uri requests, which often fail because the reporting server’s CORS policy does not whitelist ‘null’.
Team Simmer — postMessage with cross-site iframes teamsimmer.com
while the * wildcard is often used for sandboxed frames because they lack a targetable origin, it can expose report data to unintended recipients.
MDN — CSP frame-src reference developer.mozilla.org
browsers typically only re-prompt users if new permissions are requested … attackers can exploit broad host permissions already in place to inject scripts or exfiltrate cookies without triggering a new warning.
Stack Overflow — SQLite multithreaded segfault thread stackoverflow.com
use-after-free scenarios where the underlying C library attempts to access memory associated with a connection or prepared statement that has already been deallocated
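In Python wrappers the same class of crash surfaces when one thread closes a connection another thread is still driving through the C library. A common mitigation, sketched here as an illustration rather than Datasette's actual fix, is to serialize close() against every use so the freed handle is never re-entered:

```python
import sqlite3
import threading

class GuardedConnection:
    """Sketch: serialize close() against execute() so the C library is
    never entered on a connection another thread just freed.
    Illustrative only -- not Datasette's actual fix."""

    def __init__(self, path: str):
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._lock = threading.Lock()
        self._closed = False

    def execute(self, sql: str, params=()):
        with self._lock:  # use and close can never overlap
            if self._closed:
                raise RuntimeError("connection already closed")
            return self._conn.execute(sql, params).fetchall()

    def close(self):
        with self._lock:
            if not self._closed:
                self._closed = True
                self._conn.close()

conn = GuardedConnection(":memory:")
conn.execute("CREATE TABLE t (x)")
conn.execute("INSERT INTO t VALUES (1)")
assert conn.execute("SELECT x FROM t") == [(1,)]
conn.close()
try:
    conn.execute("SELECT 1")
except RuntimeError:
    pass  # a closed connection now fails loudly instead of segfaulting
```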
Vellum.ai — GPT-5.5 capabilities review vellum.ai
GPT-5.5 achieved a state-of-the-art score of 82.7% on Terminal-Bench 2.0, significantly outpacing Claude Opus 4.7 (69.4%) and Gemini 3.1 Pro (68.5%)
Towards AI — ‘I tested all 3 GPT-5.5 variants on 20 real tasks’ pub.towardsai.net
the ‘xhigh’ setting carries a heavy premium, costing approximately 2.18 times more than ‘high’ reasoning per task
Maxim AI — Ensuring AI agent reliability in production getmaxim.ai
an agent with 90% accuracy on individual steps succeeds in a 10-step complex debugging task only 34% of the time
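That figure is just per-step success compounding, under the (strong) assumption that steps fail independently:

```python
# Compound success under independent steps: p_total = p_step ** n.
p_step, n = 0.90, 10
p_total = p_step ** n
assert abs(p_total - 0.3487) < 0.001  # ~35%, matching the ~34% quoted
```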
Datasette latest changelog docs.datasette.io
TokenRestrictions.abbreviated(datasette) utility method for creating ‘_r’ dictionaries
Datasette issue #1320 (permissions/SQL refactor) github.com
TokenRestrictions function strictly as an allowlist layered on top of the actor’s existing permissions; they cannot grant a token a permission the underlying user does not already possess