Wei (Jack) Sun

Hugging Face's agent glossary makes the harness a first-class layer

Every URL the pipeline pulled into ranking for this issue — primary sources plus the supporting and contradicting findings each Researcher returned. Inline citations in the issue point back here.

Sources

Harness, Scaffold, and the AI Agent Terms Worth Getting Right huggingface.co

References

MindStudio analysis citing METR mindstudio.ai

The same model can see performance swings from 46% to 80% on the same benchmark depending solely on the quality of its harness.

MindStudio — ‘Better model vs better harness’ mindstudio.ai

METR found Claude Code beat a simple ReAct loop in only 50.7% of bootstrap samples — statistically a coin flip — and OpenAI’s Codex harness actually underperformed a generic Triframe scaffold, winning only 14.5% of the time.

Towards Data Science — workflows vs agents towardsdatascience.com

Anthropic’s framework draws the line at who decides: workflows orchestrate LLMs through predefined code paths, while agents let the LLM dynamically direct its own processes and termination.

Cognition Labs — ‘Don’t Build Multi-Agents’ youtube.com

Fragmenting context across sub-agents leads to dispersed decision-making and silent corruption that no post-processing can fix; Anthropic counters with a 90.2% win rate for lead-worker patterns but at ~15x the token cost.

kau.sh — ‘Claude Skills’ deep-dive kau.sh

On SkillsBench, human-curated skills boosted task success by over 16%, but model-generated skills often provided no measurable benefit or even degraded performance.

Firecrawl — context engineering primer firecrawl.dev

A Databricks study found model accuracy begins to drop significantly around 32,000 tokens, long before nominal million-token limits — a failure mode practitioners call ‘context rot.’

Jack Sun, writing.

Engineer · Bay Area

Hands-on with agentic AI all day: building frameworks, reading what industry ships, occasionally writing it down.

© 2026 Wei (Jack) Sun · jacksunwei.me Built on Astro · hosted on Cloudflare