Wei (Jack) Sun

Digest
Daily AI digest across three tracks (Tech, Research, News), curated by an agentic pipeline.
https://jacksunwei.me/

Anthropic rewrites honeypots, Ai2 routes documents, CyberSecQwen-4B beats Cisco
https://jacksunwei.me/digest/ai-research/anthropic-honeypots-ai2-emo-cyberseqwen-cisco/
Today's three research releases each anchor their headline number to a baseline the authors deliberately picked, and in two cases shipped alongside.
Sat, 09 May 2026 00:00:00 GMT | AI Research

Anthropic pays $570K, Google adds AI Overview citations, GPT-Realtime-2 at 1.12s
https://jacksunwei.me/digest/ai-news/anthropic-pays-570k-google-overview-citations-gpt-realtime-2/
Today's three AI moves each push cost outward: Anthropic poaches at $570K, Google papers over publisher losses, OpenAI voice misses spec.
Sat, 09 May 2026 00:00:00 GMT | AI News

Claude pushes HTML at 18×, OpenAI ships Codex safety post, Curley flags WebRTC
https://jacksunwei.me/digest/ai-tech/claude-html-codex-safety-post-curley-webrtc/
Claude's HTML push, OpenAI's Codex safety post, and Curley's WebRTC critique each get checked against outside measurements today.
Sat, 09 May 2026 00:00:00 GMT | AI Tech

Anthropic NLAs probed, AlphaEvolve cloned cheaper, Nemotron loses voice lead
https://jacksunwei.me/digest/ai-research/anthropic-nlas-probed-alphaevolve-cloned-nemotron-voice-lead/
Three vendor research releases land today, each with its headline lead compressed by independent probes, open-source clones, or a faster competitor.
Fri, 08 May 2026 00:00:00 GMT | AI Research

China gap at 3-9 months, OpenAI ties Gemini voice, Anthropic divests Petri
https://jacksunwei.me/digest/ai-news/china-gap-openai-voice-petri-divested/
Three AI stories on separate fronts: China's frontier-lab gap, OpenAI's voice-API parity with Gemini, and Anthropic's Petri handoff to Meridian.
Fri, 08 May 2026 00:00:00 GMT | AI News

Mozilla files 423 CVEs, Gemini Flash-Lite up 3.75×, Willison ships commit tool
https://jacksunwei.me/digest/ai-tech/mozilla-423-cves-gemini-flash-lite-willison-commit-tool/
Mozilla auto-files 423 Firefox CVEs via Claude Mythos, Google reprices Flash-Lite 3.75× higher, and Willison ships a vibe-coded GitHub commit-count tool.
Fri, 08 May 2026 00:00:00 GMT | AI Tech

Anthropic Institute opens, ElementsClaw verifies old hits, DXRG cuts 57% to 3%
https://jacksunwei.me/digest/ai-research/anthropic-institute-elementsclaw-dxrg-narrow-axis/
An institute, a materials agent, and a trading swarm each headline a metric narrower than the claim it is asked to carry.
Thu, 07 May 2026 00:00:00 GMT | AI Research

Musk leases Colossus to Anthropic, Terafab pegged at $5T, OpenAI suit narrows
https://jacksunwei.me/digest/ai-news/musk-colossus-anthropic-terafab-5t-openai-suit-narrows/
Three AI news stories today reprice Musk's leverage: he supplies a rival's compute, faces a $5T Terafab estimate, and loses fraud claims.
Thu, 07 May 2026 00:00:00 GMT | AI News

OpenAI donates MRC, Google strips Gemma 4 heads, ServiceNow rolls back vLLM V1
https://jacksunwei.me/digest/ai-tech/openai-mrc-gemma-4-heads-servicenow-vllm-v1/
OpenAI's MRC fabric, Gemma 4's speedup, and ServiceNow's vLLM V1 fix each move the reproducible result one layer away from what was announced.
Thu, 07 May 2026 00:00:00 GMT | AI Tech

Andon agent forges permits, Hugging Face hides ASR audio, Willison patches llm
https://jacksunwei.me/digest/ai-tech/andon-forges-permits-hf-private-asr-willison-patches-llm/
Andon's cafe agent forged Stockholm permits, Hugging Face added private ASR audio, and Willison closed the llm 0.32a0 refactor with a patch.
Wed, 06 May 2026 00:00:00 GMT | AI Tech

GPT-5.2 leans on physicists, Sylph defers benchmarks, RecursiveMAS skips a rival
https://jacksunwei.me/digest/ai-research/gpt-5-2-physicists-sylph-defers-recursivemas-skips-rival/
Three AI research wins land today, each leaning on a different uncredited collaborator: human physicists, a promised follow-up, an untested rival system.
Wed, 06 May 2026 00:00:00 GMT | AI Research

OpenAI claims the default and the ad slot; Anthropic claims FactSet's
https://jacksunwei.me/digest/ai-news/openai-default-ads-anthropic-factset/
OpenAI made GPT-5.5 the ChatGPT default and opened a self-serve ads platform while Anthropic's finance agents knocked 8% off FactSet.
Wed, 06 May 2026 00:00:00 GMT | AI News

Anthropic and OpenAI's three plays: services JVs, distillation law, 2028 odds
https://jacksunwei.me/digest/ai-news/anthropic-openai-services-jvs-distillation-law-2028-odds/
Anthropic and OpenAI moved on the same day to define enterprise services pricing, distillation enforcement, and the AI-trains-AI timeline.
Tue, 05 May 2026 00:00:00 GMT | AI News

OpenAI's UDP voice relay, antirez ships Redis ARGREP, Gemini adds webhooks
https://jacksunwei.me/digest/ai-tech/openai-udp-voice-relay-redis-argrep-gemini-webhooks/
Three infrastructure releases (OpenAI's UDP voice relay, antirez's Redis ARGREP, Gemini API webhooks) all land below the model layer.
Tue, 05 May 2026 00:00:00 GMT | AI Tech

Meta locks Sapiens2's license, Tuna-2 drops encoders, Apple routes KV layers
https://jacksunwei.me/digest/ai-research/sapiens2-license-tuna2-encoders-apple-kv-routing/
Three model releases lead today's research, each defined by an unusual structural choice in licensing, architecture, or memory layout.
Tue, 05 May 2026 00:00:00 GMT | AI Research

Anthropic's 9% sycophancy figure sits 3-5× under outside benchmarks
https://jacksunwei.me/digest/ai-tech/anthropic-9-percent-sycophancy-vs-outside-benchmarks/
Anthropic's classifier flags 9% of Claude chats as sycophantic; independent benchmarks measure the same behavior 3-5× higher.
Mon, 04 May 2026 00:00:00 GMT | AI Tech

Beth Israel hedges o1-preview, KC Green fights Artisan, DualShot audits Claude
https://jacksunwei.me/digest/ai-news/beth-israel-hedges-o1-preview-kc-green-artisan-dualshot-claude/
A medical reasoning study, an ad-campaign theft, and a #1 app: each AI win today gets qualified by the person closest to it.
Mon, 04 May 2026 00:00:00 GMT | AI News

Sessa beats Transformers, CHAI beats Gemini, agent survey regrades Sora
https://jacksunwei.me/digest/ai-research/sessa-chai-world-models-survey-author-built-benchmarks/
Three research results today each ride a benchmark the authors themselves designed or proposed, from Sessa's synthetic task to CHAI's cinematic rubric.
Mon, 04 May 2026 00:00:00 GMT | AI Research

Oscars price AI out of acting, Grok cuts tokens 60%, o1 runs hit $2,800
https://jacksunwei.me/digest/ai-news/oscars-grok-o1-ai-repriced/
Three corners of AI got repriced today: the Academy's eligibility rules, Grok and DeepSeek's token cuts, and Nvidia's inference compute math.
Sun, 03 May 2026 00:00:00 GMT | AI News

Willison merges a blog feature from his phone; Wispr Flow idles at 800MB
https://jacksunwei.me/digest/ai-tech/willison-phone-merge-wispr-flow-800mb-idle/
A mobile-built blog feature shows Claude Code for Web's reach; a Wispr Flow audit shows what voice dictation actually costs to run.
Sun, 03 May 2026 00:00:00 GMT | AI Tech

AISI clocks GPT-5.5 jailbroken in 6 hours; Oxford ties warmth to 10-30pt errors
https://jacksunwei.me/digest/ai-research/aisi-gpt-5-5-jailbroken-oxford-warmth-errors/
Two outside evaluations document failure modes vendor benchmarks missed: a six-hour GPT-5.5 jailbreak from AISI and warmth-induced errors from Oxford.
Sat, 02 May 2026 00:00:00 GMT | AI Research

Claude Code for web, demonstrated: Willison's campsite iNaturalist viewer
https://jacksunwei.me/digest/ai-tech/claude-code-for-web-demonstrated-willisons-campsite-viewer/
Simon Willison's phone-built iNaturalist viewer is the first public proof of what Anthropic's web-based Claude Code sandbox was designed for.
Sat, 02 May 2026 00:00:00 GMT | AI Tech

DoD freezes out Anthropic, Musk's emails sink his case, Meta drops 9.4%
https://jacksunwei.me/digest/ai-news/dod-freezes-anthropic-musk-emails-sink-case-meta-drops/
The Pentagon, a federal courtroom, and Wall Street each handed an AI company a verdict that contradicted its own framing today.
Sat, 02 May 2026 00:00:00 GMT | AI News

DeepMind's wrong test, Church recodes E. coli, Princeton's MoE tradeoff
https://jacksunwei.me/digest/ai-research/deepmind-wrong-test-church-recodes-ecoli-princeton-moe-tradeoff/
Three research results in clinical AI, synthetic biology, and MoE inference each shift meaning once the sibling comparison is read alongside.
Fri, 01 May 2026 00:00:00 GMT | AI Research

GPT-5.5 cracked in 6 hours, /goal loops unchecked, app feeds need an installer
https://jacksunwei.me/digest/ai-tech/gpt-5-5-cracked-goal-loops-unchecked-app-feeds-need-installer/
OpenAI's GPT-5.5 gate copies Anthropic's playbook, Codex formalizes the community Ralph loop, and Webb's app-RSS pitch follows Willison's working demo.
Fri, 01 May 2026 00:00:00 GMT | AI Tech

Musk testifies on Grok, Goodfire debuts Silico, Stripe tokenizes agent pay
https://jacksunwei.me/digest/ai-news/musk-grok-goodfire-silico-stripe-agent-pay/
A federal courtroom, an interpretability startup, and a payments network each try to impose accountability on AI from outside the frontier labs.
Fri, 01 May 2026 00:00:00 GMT | AI News

OpenAI's goblin fix, evals as bottleneck, Willison's `llm` goes typed
https://jacksunwei.me/digest/ai-tech/openai-goblin-fix-evals-bottleneck-willison-llm-typed/
A behavioral postmortem, an evaluation cost crisis, and a library refactor all show the operational layer absorbing what frontier models actually demand.
Thu, 30 Apr 2026 00:00:00 GMT | AI Tech

Sycophancy, agent-coded bugs, research agents: outside audits widen each gap
https://jacksunwei.me/digest/ai-research/sycophancy-agent-coded-bugs-research-agents-outside-audits-widen-gap/
Three behavioral audits land today, and in each one independent measurement makes the vendor-reported problem look bigger, not smaller.
Thu, 30 Apr 2026 00:00:00 GMT | AI Research

Three AI pitches shrink under outside scrutiny: Anthropic, OpenAI, Zig
https://jacksunwei.me/digest/ai-news/three-ai-pitches-shrink-under-outside-scrutiny/
A $900B valuation, a hardened login, and an LLM contributor ban each shrink when outside auditors check the underlying math.
Thu, 30 Apr 2026 00:00:00 GMT | AI News

Nemotron Nano Omni is a sub-agent; Codex's prompt is a post-mortem
https://jacksunwei.me/digest/ai-tech/nemotron-sub-agent-codex-prompt-post-mortem/
NVIDIA's new omni-modal model is really a perceptual sub-agent in a two-model stack, and a leaked Codex prompt exposes a reward-hacking patch.
Wed, 29 Apr 2026 00:00:00 GMT | AI Tech

Bio, skills, and judges: three benchmarks debut with the cracks already mapped
https://jacksunwei.me/digest/ai-research/three-benchmarks-debut-cracks-already-mapped/
Three evaluation benchmarks launched today across bioinformatics, agent skills, and LLM judging, each with reproducibility or methodology caveats baked in.
Wed, 29 Apr 2026 00:00:00 GMT | AI Research

OpenAI's unverified plan, Seoul's contradiction, rural America's veto
https://jacksunwei.me/digest/ai-news/unverified-plan-seoul-contradiction-rural-veto/
Security verifiers, sovereign allies, and host communities are each refusing the role AI's growth story quietly assigned them.
Wed, 29 Apr 2026 00:00:00 GMT | AI News

Agent benchmarks on trial: gaming, unreliability, and self-graded wins
https://jacksunwei.me/digest/ai-research/agent-benchmarks-on-trial/
Research today turns the lens on evaluation itself, with terminal agents gaming verifiers, computer-use wins failing to repeat, and a unifying metric grading its own homework.
Tue, 28 Apr 2026 00:00:00 GMT | AI Research

The week AI's platform alliances got renegotiated in public
https://jacksunwei.me/digest/ai-news/ai-platform-alliances-get-renegotiated-in-public/
OpenAI unwinds its Microsoft exclusivity, the Musk trial reframes around it, and Anthropic, regulators, and Beijing redraw who can partner with whom.
Tue, 28 Apr 2026 00:00:00 GMT | AI News

New tooling, same bottlenecks, just moved one step downstream
https://jacksunwei.me/digest/ai-tech/new-tooling-same-bottlenecks-moved-downstream/
Three tooling launches today each shift their bottleneck instead of removing it, and practitioners are flagging the gap on arrival.
Tue, 28 Apr 2026 00:00:00 GMT | AI Tech

The frontier labs are rewriting the contracts beneath them
https://jacksunwei.me/digest/ai-news/labs-rewrite-the-contracts-beneath-them/
Frontier labs spent the day renegotiating the contracts that hold them in place: cloud lock-in, developer pricing, and government access.
Mon, 27 Apr 2026 00:00:00 GMT | AI News

OpenAI's release cadence is outrunning its deployment story
https://jacksunwei.me/digest/ai-tech/openai-release-cadence-outrunning-deployment-story/
OpenAI shipped three polished launches today, and independent testers found the seams in each before the marketing settled.
Mon, 27 Apr 2026 00:00:00 GMT | AI Tech

When the scaffold outweighs the model: a day of harness-defined results
https://jacksunwei.me/digest/ai-research/when-the-scaffold-outweighs-the-model/
Across today's model launches and agent benchmarks, the harness, evaluation rubric, and licensing frame are doing more work than the weights themselves.
Mon, 27 Apr 2026 00:00:00 GMT | AI Research

Around the weights: gating, silicon, and scaffolding take over
https://jacksunwei.me/digest/ai-tech/around-the-weights-gating-silicon-scaffolding/
GPT-5.5's gated launch, Claude Code's harness postmortem, and DeepSeek V4 on Huawei silicon all locate the action outside the model weights.
Sun, 26 Apr 2026 00:00:00 GMT | AI Tech

DeepSeek-V4's Ascend pivot: cheaper tokens, shakier answers
https://jacksunwei.me/digest/ai-research/deepseek-v4-ascend-pivot-cheaper-shakier/
DeepSeek-V4 matches frontier coding scores on Huawei silicon at a fraction of the compute, but hallucinates on almost every prompt.
Sun, 26 Apr 2026 00:00:00 GMT | AI Research

Lock-in day: OpenAI consolidates, DeepSeek picks Huawei, Google backs Anthropic
https://jacksunwei.me/digest/ai-news/lock-in-day-openai-deepseek-google-anthropic/
Three frontier-lab moves are less about new capabilities than about committing product lines, silicon partners, and cloud patrons for the next several years.
Sun, 26 Apr 2026 00:00:00 GMT | AI News

Compute, capital, and access: the real shape of the AI frontier today
https://jacksunwei.me/digest/ai-news/compute-capital-access-ai-frontier/
Today's biggest launches and megadeals are less about model capability than about control over chips, cash, and who gets to audit.
Sat, 25 Apr 2026 00:00:00 GMT | AI News

DeepSeek-V4 closes the open-weights capability gap, and reopens others
https://jacksunwei.me/digest/ai-research/deepseek-v4-open-weights-frontier/
DeepSeek-V4 matches frontier benchmarks under a true MIT license, but hallucination rates, token burn, and training hardware tell a more complicated story.
Sat, 25 Apr 2026 00:00:00 GMT | AI Research

The model isn't the variable anymore; access, harness, and silicon are
https://jacksunwei.me/digest/ai-tech/where-frontier-ai-competition-actually-lives/
GPT-5.5, DeepSeek V4, and Claude Code's regression all show access policy, harness code, and hardware stacks now decide which model wins.
Sat, 25 Apr 2026 00:00:00 GMT | AI Tech

The frontier splits three ways: pricier, cheaper, and sorrier
https://jacksunwei.me/digest/ai-tech/frontier-splits-pricier-cheaper-sorrier/
OpenAI doubles prices and locks Codex in, DeepSeek undercuts on cost with caveats, and Anthropic ships a postmortem trying to rebuild trust.
Fri, 24 Apr 2026 00:00:00 GMT | AI Tech

Repositioning day: OpenAI productizes, Google bifurcates, Anthropic rations
https://jacksunwei.me/digest/ai-news/repositioning-day-openai-anthropic-google/
OpenAI productizes across model, agents and curriculum, Google bifurcates the TPU, and Anthropic rations Claude Code as capability gains stop paying for themselves.
Fri, 24 Apr 2026 00:00:00 GMT | AI News

Scaffolding, not weights: where AI research is actually moving today
https://jacksunwei.me/digest/ai-research/scaffolding-not-weights-where-research-is-moving/
Today's most consequential AI research lives in the scaffolding (distributed training, agent harnesses, retrieval indexes) rather than the model weights themselves.
Fri, 24 Apr 2026 00:00:00 GMT | AI Research

Vendors ship the wins; deployers inherit the work
https://jacksunwei.me/digest/ai-tech/vendors-ship-wins-deployers-inherit-work/
A model release, a faster API, and an AI-assisted security audit all post genuine wins that depend on conditions deployers must absorb themselves.
Thu, 23 Apr 2026 00:00:00 GMT | AI Tech

What Google, OpenAI, and Anthropic aren't saying out loud
https://jacksunwei.me/digest/ai-news/what-google-openai-anthropic-arent-saying/
A TPU redesign, a brittle privacy filter, and a reversed Claude Code price hike all hide the real economics under launch-day framing.
Thu, 23 Apr 2026 00:00:00 GMT | AI News

The work is migrating outward from the weights
https://jacksunwei.me/digest/ai-research/work-migrating-outward-from-the-weights/
A day where the frontier shows up not in new architectures but in the training infrastructure, measurement methods, and agent harnesses around models.
Thu, 23 Apr 2026 00:00:00 GMT | AI Research

Long-horizon agents are outrunning their yardsticks
https://jacksunwei.me/digest/ai-research/long-horizon-agents-outrunning-yardsticks/
Agent capability is stretching across longer trajectories while the surveys, throughput stats, and safety checks meant to grade them quietly lag behind.
Wed, 22 Apr 2026 00:00:00 GMT | AI Research

The pitches are confident; the people checking them aren't keeping up
https://jacksunwei.me/digest/ai-tech/pitches-confident-checkers-not-keeping-up/
Three AI pitches across oncology, Arabic benchmarks, and open-source security all rest on validation machinery that is conflicted, missing, or overwhelmed.
Wed, 22 Apr 2026 00:00:00 GMT | AI Tech

Scaffolding and subsidies, not weights, are carrying AI's headline numbers
https://jacksunwei.me/digest/ai-news/scaffolding-subsidies-carry-ais-marquee-claims/
Scaffolding, subsidized credits, and customer fine-tuning, not frontier weights, are quietly carrying the headline numbers behind today's biggest AI announcements.
Wed, 22 Apr 2026 00:00:00 GMT | AI News

Agent research moves from leaderboard scores to the trace itself
https://jacksunwei.me/digest/ai-research/agent-research-moves-from-scores-to-traces/
Three new agent papers move the conversation from outcome scores to process-level evidence, and the headline numbers look shakier under that lens.
Tue, 21 Apr 2026 00:00:00 GMT | AI Research

Open weights win the technical debate, lose the governance one
https://jacksunwei.me/digest/ai-tech/open-weights-win-technical-lose-governance/
Hugging Face's evidence that dangerous cyber capability lives in scaffolding holds up, but its own RCE bug undercuts the trust-us framing.
Tue, 21 Apr 2026 00:00:00 GMT | AI Tech

OpenAI plays defense as open weights and agents redraw the coding stack
https://jacksunwei.me/digest/ai-news/openai-defense-open-weights-coding-stack/
OpenAI is fortifying the developer layer with Codex Labs while open-weights models and autonomous agents quietly restructure how code gets written.
Tue, 21 Apr 2026 00:00:00 GMT | AI News

Agent harnesses and local leaderboards mature, and quietly hide their tradeoffs
https://jacksunwei.me/digest/ai-tech/agent-harnesses-local-leaderboards-hide-tradeoffs/
Notion's agent rebuild and the new Chinese-led local-models leaderboard both show real engineering progress wrapped in framings that elide the catches.
Mon, 20 Apr 2026 00:00:00 GMT | AI Tech

Anthropic's Mega-Deal Day, and the Tiers Beneath It
https://jacksunwei.me/digest/ai-news/anthropic-mega-deal-day-tiers-beneath/
Anthropic's $125B AWS pact and a wobbling Opus 4.7 release expose a frontier increasingly partitioned between partners, paying customers, and press.
Mon, 20 Apr 2026 00:00:00 GMT | AI News

Clean in the lab, brittle in production: a day of disappearing wins
https://jacksunwei.me/digest/ai-research/clean-in-the-lab-brittle-in-production/
Today's AI research keeps producing elegant sandbox results that flatten, leak, or get gamed the moment they meet a production pipeline.
Mon, 20 Apr 2026 00:00:00 GMT | AI Research