New tooling, same bottlenecks — just moved one step downstream
Every URL the pipeline pulled into ranking for this issue — primary sources plus the supporting and contradicting findings each Researcher returned. Inline citations in the issue point back here.
Sources
What’s new in pip 26.1 - lockfiles and dependency cooldowns! simonwillison.net
Richard Si describes an excellent set of upgrades to Python’s default pip tool for installing dependencies. This version drops support for Python 3.9 - fair enough, since it’s been EOL since October 2025. macOS still ships Python 3.9 as the default python3, so I tried out the new pip release against Python 3.14 like this:
uv python install 3.14
mkdir /tmp/experiment
cd /tmp/experiment
python3.14 -m venv venv
source venv/bin/activate…
An open-source spec for orchestration: Symphony openai.com
Learn how Symphony, an open-source spec for Codex orchestration, turns issue trackers into always-on agent systems—boosting engineering output and reducing context switching.
Introducing talkie: a 13B vintage language model from 1930 simonwillison.net
New project from Nick Levine, David Duvenaud, and Alec Radford (of GPT, GPT-2 and Whisper fame). talkie-1930-13b-base (53.1 GB) is a “13B language model trained on 260B tokens of historical pre-1931 English text”. talkie-1930-13b-it (26.6 GB) is a checkpoint “finetuned using a novel dataset of instruction-response pairs extracted from pre-1931 reference works”, designed to power a chat interface. You can try that out here. Both models a…
microsoft/VibeVoice simonwillison.net
Microsoft’s MIT-licensed VibeVoice speech-to-text model, with built-in speaker diarization, runs locally on a 128GB M5 Max MacBook Pro via mlx-audio, transcribing an hour of podcast audio in 8 minutes 45 seconds using a 5.71GB 4-bit MLX conversion of the 17.3GB original.
Physical AI that Moves the World — Qasar Younis & Peter Ludwig, Applied Intuition latent.space
Latent Space interviews Applied Intuition CEO Qasar Younis and CTO Peter Ludwig on deploying AI into mining rigs, drones, trucks and warships — physical vehicles operating in adversarial real-world environments rather than the consumer autonomy stack the company is better known for.
How to build scalable web apps with OpenAI’s Privacy Filter huggingface.co
Hugging Face walkthrough on wiring OpenAI’s Privacy Filter into web app architectures, covering how to scrub PII from user inputs before they reach downstream LLM calls in production deployments.
Sign of the future: GPT-5.5 oneusefulthing.org
Ethan Mollick takes GPT-5.5 as a data point on the capability curve, arguing the incremental release is less interesting as a product than as evidence that scaling-era progress hasn’t flattened out the way skeptics predicted.
WHY ARE YOU LIKE THIS simonwillison.net
ChatGPT Images 2.0, prompted with a chaotic stack of horse-on-astronaut-on-pelican-on-bicycle, spontaneously added a road sign reading ‘WHY ARE YOU LIKE THIS’ — an unsolicited editorial flourish Simon Willison verified wasn’t in the user’s prompt.
References
Hacker News thread (user ‘exclipy’) news.ycombinator.com
inscrutable agent slop… lists database fields without explaining the system’s logic or state machine
InfoWorld (analyst Sanchit Vir Gogia, Greyhound Research) infoworld.com
generation scales effortlessly, but validation does not — a 500% increase in output could lead to an unmanageable review burden
r/ClaudeAI — Stokowski project announcement reddit.com
Stokowski is a Python orchestrator that swaps Codex for Claude Code, using a WORKFLOW.md with Jinja2 templates so per-run instructions don’t pollute CLAUDE.md
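The per-run templating idea behind Stokowski can be sketched as follows. This is a minimal stand-in, not the project's code: Stokowski itself uses Jinja2, but the `{{ var }}` substitution below uses only the standard library, and the WORKFLOW.md contents and variable names are invented for illustration.

```python
import re

def render_workflow(template: str, variables: dict) -> str:
    """Fill {{ name }} placeholders Jinja2-style, so per-run instructions
    are rendered fresh each run instead of accumulating in CLAUDE.md."""
    def substitute(match):
        key = match.group(1)
        if key not in variables:
            raise KeyError(f"WORKFLOW.md references undefined variable: {key}")
        return str(variables[key])
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)

# Hypothetical per-run instructions for one orchestrated task.
workflow_md = "Fix issue #{{ issue }} on branch {{ branch }}; run {{ test_cmd }} before committing."
print(render_workflow(workflow_md, {"issue": 42, "branch": "fix/42", "test_cmd": "pytest -q"}))
```

The design point the announcement makes is separation of concerns: CLAUDE.md stays a stable, persistent prompt, while anything task-specific lives in the template and is injected at render time.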
Debuggr.io — 2026 AI code review telemetry debuggr.io
PR volume surged 98% after heavy AI adoption while human review time rose 91%; senior engineers spend 4.3 minutes per AI suggestion vs 1.2 minutes for human code
Techsy.io — background coding agents compared techsy.io
Devin solves ~13.86% of SWE-bench end-to-end issues and now starts at $20/month plus Agent Compute Units, competing directly with Symphony’s self-hosted model
Tembo.io — background coding agents review tembo.io
agents often ‘tread water’ — identifying the same issues repeatedly but failing to converge on a global architectural fix
Andrew Nesbitt / William Woodruff analysis nesbitt.io
approximately 80% of major supply chain attacks over an 18-month period could have been successfully blocked by implementing a 7-day cooldown period
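The cooldown mechanic this analysis evaluates is simple to sketch: at resolution time, reject any release younger than the cooldown window. The package versions and timestamps below are invented, and pip 26.1's actual resolver logic may differ; this only illustrates the filtering rule.

```python
from datetime import datetime, timedelta, timezone

COOLDOWN = timedelta(days=7)

def eligible_versions(releases: dict, now: datetime) -> list:
    """Return the versions whose upload time is at least COOLDOWN old.

    `releases` maps version string -> upload datetime. A freshly
    compromised release stays invisible for a full week, long enough
    for most malicious uploads to be reported and quarantined first.
    """
    return sorted(v for v, uploaded in releases.items() if now - uploaded >= COOLDOWN)

now = datetime(2026, 3, 15, tzinfo=timezone.utc)
releases = {
    "1.0.0": datetime(2026, 1, 1, tzinfo=timezone.utc),   # months old: allowed
    "1.1.0": datetime(2026, 3, 14, tzinfo=timezone.utc),  # one day old: held back
}
print(eligible_versions(releases, now))  # → ['1.0.0']
```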
Medium - PEP 751 review (Charlie Marsh critique) medium.com
features like uv run -p — which allows a user to install and run a specific subset of a locked graph — are impossible under the current standard… essentially a ‘more modern requirements.txt’ rather than a true universal lockfile
r/Python discussion of pip 26.1 reddit.com
extras and dependency groups cannot yet be selected at install time from a multi-use lockfile, forcing users to generate separate single-use files… pip 26.1 prohibits mixing VCS or local directory entries with hash-locked external requirements in the same file
LWN - pip 26.1 release coverage lwn.net
CVE-2026-6357 fixed an arbitrary code execution risk caused by deferred imports during pip’s self-upgrade check… CVE-2026-3219, a ‘tar-zip confusion’ attack that allowed attackers to obfuscate malicious code by making a .tar.gz archive appear as a .zip file
Pipenv docs on pylock.toml pipenv.pypa.io
Pipenv (version 2026.6.0) prioritizes pylock.toml over the legacy Pipfile.lock when both are present
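For reference, a minimal pylock.toml in the PEP 751 shape looks roughly like this. The package entry is invented and the URL and hash are placeholders; field names follow the spec as published, so check them against what Pipenv actually emits.

```toml
lock-version = "1.0"
created-by = "pipenv"
requires-python = ">=3.10"

[[packages]]
name = "requests"
version = "2.32.3"

[[packages.wheels]]
name = "requests-2.32.3-py3-none-any.whl"
url = "https://files.pythonhosted.org/packages/.../requests-2.32.3-py3-none-any.whl"
hashes = { sha256 = "..." }
```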
byteiota - cooldown effectiveness byteiota.com
during the 2026 LiteLLM compromise, malicious versions were live for only three hours before being quarantined, a window easily bypassed by a standard cooldown setting
MarkTechPost marktechpost.com
models trained on text processed by conventional Optical Character Recognition (OCR) achieve only 30% of the learning efficiency of those trained on human-transcribed data… instruction-following rating improved from 2.0 to 3.4 on a five-point scale during reinforcement learning
The Decoder the-decoder.com
the model occasionally demonstrates awareness of Herbert Hoover’s re-election loss in 1932 and exhibits ‘contamination’ regarding Adolf Hitler… while the model is technically unaware of World War II, it has been observed hedging its bets on world peace, citing ‘smouldering animosities’
Substack: ‘LLMs Can’t Jump’ (Tom Zahavy / DeepMind) udaykamath.substack.com
current transformer architectures lack the abductive reasoning necessary for the seven-year conceptual journey Einstein undertook
Byteiota review byteiota.com
the model often expresses views supporting British imperialism and era-typical prejudices, such as certainties that India would never achieve independence… drift into ‘plausible nonsense’ [and would] ‘pollute your brain’ with historical hallucinations
WolfDigest benchmark summary wolfdigest.com
When the evaluation is filtered to remove anachronistic concepts, the performance gap between the vintage and modern models roughly halves… it successfully inverted a rotation cipher function; when shown an encoding function that added 5 to character positions, it correctly derived a decoding function that subtracted 5
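The inversion task WolfDigest describes is easy to reproduce concretely: given an encoder that shifts character positions forward by 5, the correct decoder shifts back by 5. This is a plain Caesar shift over lowercase letters, reconstructed here for illustration rather than taken from the benchmark itself.

```python
def encode(text: str, shift: int = 5) -> str:
    """Shift each lowercase letter forward by `shift` positions, wrapping at 'z'."""
    return "".join(
        chr((ord(c) - ord("a") + shift) % 26 + ord("a")) if c.islower() else c
        for c in text
    )

def decode(text: str, shift: int = 5) -> str:
    """Invert encode() by shifting the same distance backwards."""
    return encode(text, -shift)

message = "attack at dawn"
assert decode(encode(message)) == message  # the round trip recovers the plaintext
print(encode(message))  # → fyyfhp fy ifbs
```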
Medium: India AI Impact Summit 2026 recap medium.com
if an AI were trained solely on scientific knowledge available up to 1911, could it independently derive the general theory of relativity by 1915?… true AGI requires the ability to generate paradigm-shifting hypotheses from first principles