Sources

Learning on the Shop floor simonwillison.net

Tobias Lütke describes Shopify’s internal coding agent tool, River, which operates entirely in public on their Slack: River does not respond to direct messages. She politely declines and suggests to create a public channel for you and her to start working in. I myself work with river in #tobi_river channel and many followed this pattern. Every conversation is therefore searchable. Anyone at Shopify can jump in. In my own channel, there are over 100 people who, react t…

Using LLM in the shebang line of a script simonwillison.net

Kim_Bruning on Hacker News: But seriously, you can put a shebang on an english text file now (if you’re sufficiently brave) […] This inspired me to look at patterns for doing exactly that with LLM. Here’s the simplest, which takes advantage of LLM fragments:

    #!/usr/bin/env -S llm -f
    Generate an SVG of a pelican riding a bicycle

But you can also incorporate tool calls using the -T name_of_tool option:

    #!/usr/bin/env -S llm -T llm_time -f
    Write…
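
Why the -S flag matters in the shebang lines quoted above can be sketched in a few lines of Python. This is a rough model, assuming Linux behaviour (everything after the interpreter path arrives as a single argument) and using shlex.split as a stand-in for env's own string splitting, which differs in its escape handling. The function names kernel_argv and env_command and the script path ./pelican.sh are hypothetical.

```python
import shlex


def kernel_argv(shebang_line: str, script_path: str) -> list[str]:
    """Approximate the argv Linux builds when executing a script with a shebang."""
    if not shebang_line.startswith("#!"):
        raise ValueError("not a shebang line")
    rest = shebang_line[2:].strip()
    # Assumption: Linux passes the text after the interpreter as ONE argument.
    interpreter, _, arg = rest.partition(" ")
    arg = arg.strip()
    return [interpreter] + ([arg] if arg else []) + [script_path]


def env_command(argv: list[str]) -> list[str]:
    """Approximate the command env would try to run for the given argv."""
    args = argv[1:]
    if args and args[0].startswith("-S"):
        # env -S splits the single string into separate words.
        # shlex.split is only an approximation of env's splitting rules.
        return shlex.split(args[0][2:]) + args[1:]
    # Without -S the whole string is treated as one program name.
    return args


with_s = env_command(kernel_argv("#!/usr/bin/env -S llm -f", "./pelican.sh"))
without_s = env_command(kernel_argv("#!/usr/bin/env llm -f", "./pelican.sh"))
print(with_s)     # ['llm', '-f', './pelican.sh']
print(without_s)  # ['llm -f', './pelican.sh']
```

In this model the script path ends up as the value of -f, so the file itself is what gets passed to llm. Without -S, env would look for a program literally named "llm -f".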

Quoting James Shore simonwillison.net

Your AI coding agent, the one you use to write code, needs to reduce your maintenance costs. Not by a little bit, either. You write code twice as quick now? Better hope you’ve halved your maintenance costs. Three times as productive? One third the maintenance costs. Otherwise, you’re screwed. You’re trading a temporary speed boost for permanent indenture. […] The math only works if the LLM decreases your maintenance costs, and by exactly the inverse of the rate it adds code. If you double you…
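
The arithmetic in the quote can be made concrete with a small sketch. It assumes a deliberately simple model that is not Shore's own formulation: total maintenance burden is the amount of code produced multiplied by the maintenance cost per unit of code. The function names are hypothetical.

```python
from fractions import Fraction


def required_maintenance_factor(speedup) -> Fraction:
    """Cost-per-unit multiplier needed to keep total maintenance burden flat.

    Assumption: burden = (code produced) x (maintenance cost per unit).
    """
    return 1 / Fraction(speedup)


def relative_burden(speedup, maintenance_factor) -> Fraction:
    """Total burden relative to the pre-agent baseline under the same model."""
    return Fraction(speedup) * Fraction(maintenance_factor)


print(required_maintenance_factor(2))       # 1/2
print(required_maintenance_factor(3))       # 1/3
print(relative_burden(2, Fraction(1, 2)))   # 1
print(relative_burden(3, 1))                # 3
```

Under this model, tripling output with no change in per-unit maintenance cost triples the ongoing burden, which is the trade the quote warns about.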

Building Blocks for Foundation Model Training and Inference on AWS huggingface.co

Amazon’s engineering post on Hugging Face walks through the infrastructure pieces teams assemble to train and serve foundation models on AWS, covering compute, storage, and orchestration choices for large-scale runs. The guide targets practitioners standing up pipelines rather than picking a managed service.

References

ZenML LLMOps Database — Shopify River case study zenml.io

5,938 employees collaborated with River across 4,450 public channels in a 30-day window… River authored 1,870 pull requests in a single week (~1/8 of all merged PRs), and its merge rate climbed from 36% to 77% over two months without any underlying model upgrade.

Digital Commerce 360 — leaked Shopify AI memo digitalcommerce360.com

Before asking for more headcount and resources, teams must demonstrate why they cannot get what they want done using AI… reflexive AI usage is now a baseline expectation at Shopify.

ABA News commentary on River ababnews.com

Some users have expressed concerns that such a high degree of transparency and reliance on shared patterns might lead to a ‘hive mind’ that could inadvertently stifle unconventional innovation.

Sharadja.in — Midjourney’s Discord community model sharadja.in

Unless users pay for high-tier plans to access ‘Stealth Mode,’ every prompt and image remains searchable by the community, which is often a non-starter for companies handling sensitive or proprietary designs.

Microsoft Tech Community — Copilot Spaces / Anthropic Claude Code Channels comparison techcommunity.microsoft.com

Claude Code Channels act as persistent, purpose-built workspaces for AI agents — comparable to Slack channels for human teams — letting developers segment work by feature or team while avoiding ‘context pollution.’

Exceeds.ai — AI coding assistant risk report blog.exceeds.ai

AI-generated code has a 2.7x higher vulnerability density compared to human-written code… Claude Code-assisted commits showed a 3.2% secret-leak rate, more than double the industry baseline.

Hacker News comment by Kim_Bruning (item 48090590) news.ycombinator.com

But seriously, you can put a shebang on an english text file now (if you’re sufficiently brave)

Medium — David Baek on prompt injection medium.com

a malicious prompt might be caught by a filter 90% of the time but succeed on the 10th attempt due to minor variations in the model’s reasoning path

Keysight blog on CVE-2024-8309 (LangChain GraphCypherQAChain) keysight.com

prompt injection allowed attackers to manipulate graph database queries, leading to unauthorized data deletion or exfiltration

Kanaries docs — Python shebang / env -S portability docs.kanaries.net

The -S flag… was introduced to GNU Coreutils in version 8.30, released in July 2018… scripts intended for enterprise or legacy systems—such as RHEL 7 or older BusyBox-based embedded systems—will fail
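
A script that depends on env -S could guard against older GNU Coreutils along these lines. This is a minimal sketch, assuming the first line of `env --version` output looks like "env (GNU coreutils) 9.1". The function names are hypothetical, the sample strings are illustrative, and non-GNU env implementations are outside its scope, so it returns False whenever it cannot tell.

```python
import re

# Per the excerpt above: -S arrived in GNU Coreutils 8.30.
MIN_VERSION = (8, 30)


def parse_coreutils_version(version_output: str):
    """Extract (major, minor) from `env --version` text, or None if not found."""
    match = re.search(r"\(GNU coreutils\)\s+(\d+)\.(\d+)", version_output)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2)))


def supports_env_s(version_output: str) -> bool:
    """Conservative check: True only for a recognised GNU version >= 8.30."""
    version = parse_coreutils_version(version_output)
    return version is not None and version >= MIN_VERSION


print(supports_env_s("env (GNU coreutils) 8.22"))  # False
print(supports_env_s("env (GNU coreutils) 8.30"))  # True
print(supports_env_s("env (GNU coreutils) 9.1"))   # True
print(supports_env_s("unrecognised output"))       # False
```

Comparing versions as integer tuples avoids the trap of string comparison, where "8.4" would sort after "8.30".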

LibHunt — llm vs aichat comparison libhunt.com

aichat is described as an ‘all-in-one’ terminal app that includes a Chat-REPL, RAG capabilities, and a shell assistant… mods… functioning similarly to grep or sed by focusing on piping stdin to an LLM

Help Net Security — TrustFall AI coding CLI research helpnetsecurity.com

agentic CLI tools can be compromised by malicious project configuration files that trick the assistant into running unauthorized helper programs

METR randomized controlled trial (July 2025) metr.org

developers were 19% slower on average when using AI tools compared to working manually… Before the experiment, participants expected a 24% speedup; after finishing, they self-reported a perceived productivity gain of roughly 20%

GitClear 2025 code quality report gitclear.com

in 2024, the volume of ‘copy/pasted’ lines exceeded ‘moved’ lines for the first time in history… an eightfold increase in duplicated code blocks (five or more lines) in 2024, making redundancy levels ten times higher than in 2022

DORA 2024 report (via getdx.com) getdx.com

a 25% increase in AI adoption is associated with a 1.5% drop in delivery throughput and a 7.2% decrease in delivery stability

Hacker News discussion (item 48089289) news.ycombinator.com

AI has become highly effective at ‘wrangling’ legacy code—modernizing old libraries, automating end-to-end tests… Some users reported that AI makes older, multi-decade projects ‘suddenly easier to work with’

LinearB analysis of agent refactoring operations linearb.io

53.9% of agent-driven refactorings are low-level, consistency-oriented tasks like renaming variables or updating types… Human developers are significantly more likely to perform high-level architectural changes

Moonlight review of ‘To What Extent Does Agent-Generated Code Require Maintenance?’ themoonlight.io

approximately 15% of AI-generated code introduces immediate quality issues, with code smells accounting for nearly 90% of these defects… roughly 23% of AI-introduced problems persist in repositories long-term

Jack Sun, writing.