llm 0.32a2 adopts Responses API, CSP tool gates fetches, Codex repros segfault
Every URL the pipeline pulled into ranking for this issue — primary sources plus the supporting and contradicting findings each Researcher returned. Inline citations in the issue point back here.
Sources
llm 0.32a2 simonwillison.net
Release: llm 0.32a2 A bunch of useful stuff in this LLM alpha, but the most important detail is this one: Most reasoning-capable OpenAI models now use the /v1/responses endpoint instead of /v1/chat/completions . This enables interleaved reasoning across tool calls for GPT-5 class models. #1435 This means you can now see the summarized reasoning tokens when you run prompts against an OpenAI model, displayed in a different color to standard error. Use the -R or --hide-reasoning flags if you don't…
CSP Allow-list Experiment simonwillison.net
Tool: CSP Allow-list Experiment An experiment that shows that you can load an app in a CSP-protected sandboxed iframe (see previous note ) and have a custom fetch() that intercepts CSP errors and passes them up to the parent window… which can then prompt the user to add that domain to an allow-list and then refresh the page. I built this one with GPT-5.5 xhigh running in the Codex desktop app. Tags: content-security-policy , iframes , security
datasette 1.0a29 simonwillison.net
Release: datasette 1.0a29 New TokenRestrictions.abbreviated(datasette) utility method for creating “_r” dictionaries. #2695 Table headers and column options are now visible even if a table contains zero rows. #2701 Fixed bug with display of column actions dialog on Mobile Safari. #2708 Fixed bug where tests could crash with a segfault due to a race condition involving Datasette.close() . #2709 That segfault bug was gnarly . I added a mechanism to Datasette recently that woul…
References
newreleases.io (0.32a2 changelog) newreleases.io
Use the new model option -o chat_completions 1 to force the CLI to fall back to the older /v1/chat/completions code path, restoring compatibility with Ollama, vLLM, and Groq.
Sean Goedecke — ‘The Responses API’ seangoedecke.com
The Responses API is designed specifically to hide proprietary ‘thinking styles’ and implementation details from other labs while still providing a stateful experience.
The New Stack — Open Responses vs Chat Completion thenewstack.io
OpenAI claims 3–5% improvement on benchmarks such as SWE-bench and TAUBench when the same models are used via /v1/responses instead of chat/completions, attributed to ‘preserved reasoning’ across turns.
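The claimed gain rests on reasoning state being carried across turns rather than discarded. As a rough sketch of how the two request shapes differ (field names follow OpenAI's published API docs, but the model id and response id here are placeholders, and no network call is made):

```python
# Illustrative request bodies only; nothing is sent anywhere.
chat_completions_request = {
    "model": "gpt-5",  # placeholder model id
    "messages": [
        {"role": "system", "content": "You are a terse assistant."},
        {"role": "user", "content": "Summarize the release notes."},
    ],
}

responses_request = {
    "model": "gpt-5",
    "input": [
        {"role": "user", "content": "Summarize the release notes."},
    ],
    # Server-side state: chain onto a prior turn instead of resending
    # the whole transcript, so reasoning items can persist across turns.
    "previous_response_id": "resp_abc123",  # placeholder id
}

# Chat Completions is stateless (full history resent every call);
# Responses threads state through previous_response_id.
assert "previous_response_id" not in chat_completions_request
assert "messages" not in responses_request
```

The Reddit thread below suggests this statefulness is also where the rough edges are: context that callers assumed would ride along with previous_response_id sometimes did not.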
r/OpenAIDev thread on Responses API reddit.com
Early adopters reported the apparent loss of system-instruction context when using previous_response_id, and inconsistent tool-calling behavior that was not present in the original Chat Completions interface.
OpenAI Cookbook — Reasoning Items developers.openai.com
By opting into include=["reasoning.encrypted_content"], developers receive a proprietary encrypted blob that can be passed back so models like o3 do not have to ‘think from scratch’ under Zero Data Retention.
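Under Zero Data Retention nothing is stored server-side, so the encrypted reasoning item has to ride along in the client's own conversation state. A hedged sketch of that round trip, with payload shapes assumed from the Cookbook's description and a fabricated placeholder blob:

```python
# Sketch of the ZDR round trip described above. Shapes are assumptions
# based on the Cookbook's prose; no network calls are made.
request = {
    "model": "o3",
    "input": [{"role": "user", "content": "Plan the migration."}],
    "store": False,                              # ZDR: nothing persisted
    "include": ["reasoning.encrypted_content"],  # ask for the opaque blob
}

# Pretend response: an encrypted reasoning item the client cannot
# read but can replay verbatim.
response_output = [
    {"type": "reasoning", "encrypted_content": "gAAAAB...opaque..."},
    {"type": "message", "role": "assistant",
     "content": "Step 1: snapshot the database."},
]

# Next turn: feed the encrypted item back so the model resumes its
# earlier chain of thought instead of re-deriving it from scratch.
follow_up = {
    "model": "o3",
    "store": False,
    "input": response_output + [
        {"role": "user", "content": "Now handle the rollback path."},
    ],
}

assert follow_up["input"][0]["type"] == "reasoning"
```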
Simon Willison’s substack — ‘LLM 0.27 with GPT-5’ simonw.substack.com
The tool now logs detailed token usage, including ‘invisible’ reasoning tokens, to its internal SQLite database — useful given pelican-SVG generations sometimes cost over a dollar a piece.
Simon Willison — earlier CSP iframe escape test (Apr 2026) simonwillison.net
Once a CSP is parsed from a meta tag, it is immutable; the script cannot remove, modify, or overwrite it, even if the iframe is navigated to a data: URI.
Vercel Sandbox Firewall docs vercel.com
v0 utilizes a Sandbox Firewall with three modes: allow-all, deny-all, and user-defined … ‘credentials brokering’ injects secrets into egress traffic after it leaves the sandbox, ensuring the untrusted AI code never actually ‘sees’ the API keys it is using.
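The brokering idea is that sandboxed code only ever handles a placeholder, and a trusted egress proxy swaps in the real secret after the request leaves the sandbox. A minimal, hypothetical sketch of that pattern (not Vercel's implementation; all names and the placeholder syntax are invented here):

```python
from urllib.parse import urlparse

# Real secrets live only in the broker process, never in the sandbox.
SECRETS = {"OPENAI_KEY": "sk-real-secret"}

def sandbox_build_request() -> dict:
    # Untrusted code sees only the placeholder, never the key itself.
    return {
        "url": "https://api.example.com/v1/chat",
        "headers": {"Authorization": "Bearer {{OPENAI_KEY}}"},
    }

def broker_egress(request: dict, allowlist: set) -> dict:
    # Deny-by-default firewall: only allowlisted hosts get out.
    host = urlparse(request["url"]).hostname
    if host not in allowlist:
        raise PermissionError(f"egress to {host} blocked")
    # Inject the real credential only as the request leaves the sandbox.
    headers = {
        k: v.replace("{{OPENAI_KEY}}", SECRETS["OPENAI_KEY"])
        for k, v in request["headers"].items()
    }
    return {**request, "headers": headers}

req = sandbox_build_request()
out = broker_egress(req, allowlist={"api.example.com"})
assert out["headers"]["Authorization"] == "Bearer sk-real-secret"
assert "sk-real-secret" not in req["headers"]["Authorization"]
```

The design choice worth noting: the substitution happens on the broker's side of the trust boundary, so even fully compromised sandbox code can at worst ask the broker to spend a credential, never read it.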
Anthropic — Claude Artifacts support docs support.claude.com
Anthropic uses a Model Context Protocol (MCP) to broker access to external services … users must explicitly approve individual service connections, keeping the ‘sandbox’ closed except for authorized ‘doors’.
web.dev — Sandboxed iframes web.dev
If a sandbox is configured with allow-scripts but without allow-same-origin, the ‘null’ origin remains a persistent hurdle for any automated report-uri requests, which often fail because the reporting server’s CORS policy does not whitelist ‘null’.
Team Simmer — postMessage with cross-site iframes teamsimmer.com
while the * wildcard is often used for sandboxed frames because they lack a targetable origin, it can expose report data to unintended recipients.
MDN — CSP frame-src reference developer.mozilla.org
browsers typically only re-prompt users if new permissions are requested … attackers can exploit broad host permissions already in place to inject scripts or exfiltrate cookies without triggering a new warning.
Stack Overflow — SQLite multithreaded segfault thread stackoverflow.com
use-after-free scenarios where the underlying C library attempts to access memory associated with a connection or prepared statement that has already been deallocated
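In Python wrappers the same class of crash surfaces when one thread closes a connection another thread is still driving through the C library. A common mitigation, sketched here as an illustration rather than Datasette's actual fix, is to serialize close() against every use so the freed handle is never re-entered:

```python
import sqlite3
import threading

class GuardedConnection:
    """Sketch: serialize close() against execute() so the C library is
    never entered on a connection another thread just freed.
    Illustrative only -- not Datasette's actual fix."""

    def __init__(self, path: str):
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._lock = threading.Lock()
        self._closed = False

    def execute(self, sql: str, params=()):
        with self._lock:  # use and close can never overlap
            if self._closed:
                raise RuntimeError("connection already closed")
            return self._conn.execute(sql, params).fetchall()

    def close(self):
        with self._lock:
            if not self._closed:
                self._closed = True
                self._conn.close()

conn = GuardedConnection(":memory:")
conn.execute("CREATE TABLE t (x)")
conn.execute("INSERT INTO t VALUES (1)")
assert conn.execute("SELECT x FROM t") == [(1,)]
conn.close()
try:
    conn.execute("SELECT 1")
except RuntimeError:
    pass  # a closed connection now fails loudly instead of segfaulting
```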
Vellum.ai — GPT-5.5 capabilities review vellum.ai
GPT-5.5 achieved a state-of-the-art score of 82.7% on Terminal-Bench 2.0, significantly outpacing Claude Opus 4.7 (69.4%) and Gemini 3.1 Pro (68.5%)
Towards AI — ‘I tested all 3 GPT-5.5 variants on 20 real tasks’ pub.towardsai.net
the ‘xhigh’ setting carries a heavy premium, costing approximately 2.18 times more than ‘high’ reasoning per task
Maxim AI — Ensuring AI agent reliability in production getmaxim.ai
an agent with 90% accuracy on individual steps succeeds in a 10-step complex debugging task only 34% of the time
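That figure is just per-step success compounding, under the (strong) assumption that steps fail independently:

```python
# Compound success under independent steps: p_total = p_step ** n.
p_step, n = 0.90, 10
p_total = p_step ** n
assert abs(p_total - 0.3487) < 0.001  # ~35%, matching the ~34% quoted
```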
Datasette latest changelog docs.datasette.io
TokenRestrictions.abbreviated(datasette) utility method for creating ‘_r’ dictionaries
Datasette issue #1320 (permissions/SQL refactor) github.com
TokenRestrictions function strictly as an allowlist layered on top of the actor’s existing permissions; they cannot grant a token a permission the underlying user does not already possess