Wei (Jack) Sun

Berkeley scales the harness, Qwen-9B writes it, Moltbook agents talk around it

Every URL the pipeline pulled into ranking for this issue — primary sources plus the supporting and contradicting findings each Researcher returned. Inline citations in the issue point back here.

Sources

Emergent Languages in Populations of Language Model Agents: From Token Efficiency to Oversight Evasion huggingface.co

Research examines emergent languages in autonomous AI agents designed to evade human oversight, revealing sophisticated steganographic techniques and questioning current monitoring approaches.

Harness Updating Is Not Harness Benefit: Disentangling Evolution Capabilities in Self-Evolving LLM Agents huggingface.co

Research reveals that harness self-evolution capabilities in LLM agents show unexpected patterns: harness-updating effectiveness is consistent across model capabilities, while harness-benefit follows a non-monotonic trend with mid-tier models performing best.

From Model Scaling to System Scaling: Scaling the Harness in Agentic AI huggingface.co

Agentic AI advancement requires scaling system architecture around foundation models, focusing on auditable and verifiable components rather than just model capacity.

From Prompt Injection to Persistent Control: Defending Agentic Harness Against Trojan Backdoors huggingface.co

Multi-step trojan attacks slip past existing agent defenses by spreading malicious prompts across several operations, evading per-step filters. DASGuard counters this with runtime blocking and sanitized commits, targeting persistent backdoor control in local LLM agent harnesses rather than single-turn prompt injection.

Mellum2 Technical Report huggingface.co

Mellum2 is an open-weight Mixture-of-Experts model with 12B total and 2.5B active parameters per token, tuned for software engineering. It uses grouped-query and sliding-window attention, multi-token prediction, FP8 training, and YaRN to run efficiently on commodity GPUs.

VLM3: Vision Language Models Are Native 3D Learners huggingface.co

Vision language models can match specialized 3D networks on depth estimation, pixel correspondence, and camera pose with only light architectural tweaks like focal length unification and text-based pixel references. The Facebook Research approach skips heavy augmentation, relying on data mixture and scale instead.

How can embedding models bind concepts? huggingface.co

CLIP-style embeddings recognize individual concepts but fumble when binding them together in a scene, hurting cross-modal retrieval. Controlled transformer experiments show that multiplicative interactions let models learn low-complexity binding functions that generalize, pointing to an architectural fix rather than more data.

SoundnessBench: Can Your AI Scientist Really Tell Good Research Ideas from Bad Ones? huggingface.co

SoundnessBench tests whether LLMs can judge methodological validity of ML research proposals drawn from ICLR submissions. Current models consistently rate flawed ideas as sound, a persistent optimism bias that undermines their use as autonomous reviewers or hypothesis generators in AI scientist pipelines.

SwanVoice: Expressive Long-Form Zero-Shot Speech Synthesis for Both Monologue and Dialogue huggingface.co

The Swan release pairs SwanVoice, a zero-shot TTS system for expressive monologue and dialogue built on VAE plus flow-matching DiT, with SwanBench-Speech for long-form evaluation and SwanSphere, a streaming autoregressive diffusion transformer that generates spatial audio from panoramic video and text.

Comprehensive Benchmarking of Long-Form Speech Generation in Diverse Scenarios huggingface.co

SwanBench-Speech addresses the lack of comprehensive long-form speech evaluation by providing a benchmark with diverse scenarios, multi-dimensional metrics, and insights into model limitations.

Towards Streaming Synchronized Spatial Audio Generation via Autoregressive Diffusion Transformer huggingface.co

SwanSphere presents a unified streaming framework for high-fidelity spatial audio generation from panoramic videos and text prompts using causal autoregressive diffusion transformers and multimodal learning strategies.

SANA-Streaming: Real-time Streaming Video Editing with Hybrid Diffusion Transformer huggingface.co

SANA-Streaming performs high-resolution video-to-video editing in real time using a hybrid diffusion transformer with softmax attention and flow matching. Cycle-reverse regularization preserves temporal consistency, while mixed-precision quantization and tensor-core co-design keep throughput viable on consumer hardware.

RayDer: Scalable Self-Supervised Novel View Synthesis from Real-World Video huggingface.co

RayDer is a unified feed-forward transformer that consolidates camera estimation, scene reconstruction, and rendering for self-supervised novel view synthesis, enabling stable training on real-world video through dynamic state absorption and demonstrating clean scaling behavior.

Linear Scaling Video VLMs for Long Video Understanding huggingface.co

StateKV enables efficient long-video vision-language model inference by maintaining cross-frame context in a fixed-capacity recurrent state while using a full per-frame cache for decoding, achieving linear-time prefill with minimal accuracy loss compared to full self-attention.

OpenSkillEval: Automatically Auditing the Open Skill Ecosystem for LLM Agents huggingface.co

OpenSkillEval is an automatic evaluation framework that assesses skill-augmented agent systems and skills across diverse real-world applications, revealing that skill availability doesn’t guarantee effective usage and that performance benefits depend heavily on model and framework combinations.

COLLEAGUE.SKILL: Automated AI Skill Generation via Expert Knowledge Distillation huggingface.co

Person-grounded AI skills are automatically distilled from heterogeneous traces into inspectable, correctable packages that capture both capabilities and behavioral patterns.

Exploring Autonomous Agentic Data Engineering for Model Specialization huggingface.co

Large language models can autonomously execute end-to-end data engineering pipelines for model specialization through iterative data adaptation and optimization.

Recovering Policy-Induced Errors: Benchmarking and Trajectory Synthesis for Robust GUI Agents huggingface.co

GUI agents lack robust error recovery capabilities, which this work addresses through GUI-RobustEval and Robustness-driven Trajectory Synthesis, demonstrating improved performance on real-world benchmarks.

Not All Disagreement Is Learnable: Token Teachability in On-Policy Distillation huggingface.co

Token-level teacher signals in on-policy distillation are better predicted by teachability—measuring local compatibility between teacher and student distributions—than by raw KL disagreement alone.

LongDS-Bench: On the Failure of Long-Horizon Agentic Data Analysis huggingface.co

LongDS-Bench evaluates agents’ ability to maintain and update analytical states over extended data analysis sessions using real-world tasks from Kaggle notebooks.

GrepSeek: Training Search Agents for Direct Corpus Interaction huggingface.co

GrepSeek enables efficient search agent training through direct corpus interaction using shell commands and a two-stage approach combining a cold-start dataset and group relative policy optimization.

SCOPE: Self-Play via Co-Evolving Policies for Open-Ended Tasks huggingface.co

SCOPE is a self-play framework that trains language models on open-ended tasks through policy co-evolution, achieving superior performance on both targeted and held-out benchmarks without external supervision.

Trust-Region Behavior Blending for On-Policy Distillation huggingface.co

Trust-Region Behavior Blending improves on-policy distillation by replacing early poor-quality student rollouts with teacher-like behavior within a KL trust region during warmup.

DRIFT: Decoupled Rollouts and Importance-Weighted Fine-Tuning for Efficient Multi-Turn Optimization huggingface.co

DRIFT is a framework that combines offline trajectories with importance-weighted supervised fine-tuning to achieve multi-turn interactive learning efficiency and performance comparable to reinforcement learning.

Representation Forcing for Bottleneck-Free Unified Multimodal Models huggingface.co

Representation Forcing enables unified multimodal models to perform both perception and generation tasks end-to-end without relying on external latent spaces, matching state-of-the-art performance in image generation while improving understanding capabilities.

Linearizing Vision Transformer with Test-Time Training huggingface.co

Researchers develop a method to convert pretrained softmax-attention models to linear-complexity Test-Time Training architectures through architectural and representational alignment, achieving fast inference with minimal fine-tuning.

DecMem: Towards Minute-Long Consistent World Generation with Decoupled Memory huggingface.co

A novel decoupled memory architecture called DecMem is introduced for consistent long-horizon video generation, addressing computational inefficiency and attention dispersion issues in learnable memory systems.

The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement huggingface.co

The SAVE framework improves reward model training by using value functions to grade on-policy responses and update models through contrastive objectives.

SAAS: Self-Aware Reinforcement Learning for Over-Search Mitigation in Agentic Search huggingface.co

SAAS introduces a reinforcement learning framework that enhances agent self-awareness to reduce unnecessary searches in LLM-based question answering systems.

LongTraceRL: Learning Long-Context Reasoning from Search Agent Trajectories with Rubric Rewards huggingface.co

LongTraceRL addresses long-context reasoning challenges in large language models through tiered distractor construction and rubric reward design for improved reasoning quality.

Task-Focused Memorization for Multimodal Agents huggingface.co

A reinforcement-learning-based framework called TaskMem is introduced to dynamically determine what information to store in long-term memory for multimodal agents, improving performance on streaming video benchmarks.

dMoE: dLLMs with Learnable Block Experts huggingface.co

Diffusion large language models combined with mixture-of-experts architectures face a mismatch between block parallel decoding and token-level expert selection, which dMoE addresses by aggregating token-level distributions into block-level routing to reduce activated experts and improve efficiency.

Hide-and-Seek in Trajectories: Discovering Failure Signals for VLA Runtime Monitoring huggingface.co

The Hide-and-Seek framework detects robot execution failures in vision-language-action models by localizing failure-indicative actions through contrastive learning from trajectory-level supervision, without step-level annotations.

GDSD: Reinforcement Learning as Guided Denoiser Self-Distillation for Diffusion Language Models huggingface.co

Guided Denoiser Self-Distillation (GDSD) improves diffusion large language models by directly distilling denoisers from advantage-guided self-teachers, avoiding biases introduced by ELBO likelihood surrogates and achieving superior performance on benchmark tasks.

VisualThink-VLA: Visual Intermediate Reasoning for Effective and Low-Latency Vision-Language-Action Policies huggingface.co

VisualThink-VLA enables fast, accurate vision-language-action policies through visual reasoning that preserves spatial precision and reduces latency compared to text-based approaches.

MAAT: Multi-phase Adapter-Aware Targeted Unlearning huggingface.co

Existing machine unlearning benchmarks are heavily skewed toward non-causal question types, masking failures in causal knowledge removal; a new balanced benchmark and unlearning method are introduced to address this gap.

Seeing Isn’t Knowing: Do VLMs Know When Not to Answer Spatial Questions (and Why)? huggingface.co

Vision-language models exhibit overconfidence in spatial reasoning tasks and struggle to identify when additional observations are needed to resolve uncertainty.

Lumos-Nexus: Efficient Frequency Bridging with Homogeneous Latent Space for Video Unified Models huggingface.co

Lumos-Nexus is a training-efficient video generation framework that uses a two-stage approach with a lightweight generator for training and a high-capacity pretrained generator for inference, achieving enhanced visual fidelity through Unified Progressive Frequency Bridging.

iVGR: Internalizing Visually Grounded Reasoning for MLLMs with Reinforcement Learning huggingface.co

A reinforcement learning framework called iVGR is introduced to transfer visual localization capabilities into textual reasoning, improving fine-grained perception in multimodal language models without requiring explicit visual grounding during inference.

One-Forcing: Towards Stable One-Step Autoregressive Video Generation huggingface.co

One-Forcing improves one-step video generation quality and efficiency by combining a DMD objective with a GAN loss, achieving state-of-the-art results with reduced training costs.

SurGe: Improved Surface Geometry in Point Maps huggingface.co

SurGe improves 3D reconstruction accuracy by introducing a point map normal metric and combining a point gradient matching loss with a Neighborhood Attention Decoder for better local surface geometry estimation.

HL-OutPaint: Coarse-to-Fine Video Outpainting for High-Resolution Long-Range Videos huggingface.co

HL-OutPaint is a high-resolution video outpainting framework that uses a coarse-to-fine strategy with global coarse guidance to enable large spatial extrapolation and long sequence generation while maintaining spatio-temporal consistency.

PEEK: Picking Essential frames via Efficient Knowledge distillation huggingface.co

PEEK is an efficient dynamic frame sampling method that distills caption-conditioned frame relevance rankings from a teacher model into a lightweight temporal model, outperforming state-of-the-art methods in video captioning while maintaining computational efficiency.

DEMON: Diffusion Engine for Musical Orchestrated Noise huggingface.co

DEMON enables real-time control of a diffusion model as a musical instrument through specialized scheduling, shared state management, and optimized decoding techniques.

Count Anything huggingface.co

A generalist model for text-guided object counting across multiple domains is presented, utilizing dual-granularity instance enumeration and complementary counting fusion for improved accuracy and cross-domain generalization.

AlphaTransit: Learning to Design City-scale Transit Routes huggingface.co

AlphaTransit combines Monte Carlo Tree Search with neural policy-value networks to optimize bus route design by predicting downstream quality and enabling lookahead decisions without simulator rollouts.

FRAPPE: Full Input, Residual Output Autoencoding with Projection Pursuit Encoder huggingface.co

A novel autoencoding framework called FRAPPE uses a projection pursuit encoder to predict residuals from full input, enabling efficient variable-rate image compression with fast CPU-based encoding.

Function2Scene: 3D Indoor Scene Layout from Functional Specifications huggingface.co

Function2Scene generates 3D indoor layouts from functional descriptions by parsing user needs and applying design constraints through an iterative refinement process combining geometric analysis, language modeling, and visual assessment.

GGT-100K: Generative Ground Truth for Generalizable Real-World Image Restoration huggingface.co

Generative multimodal foundation models are used to create high-quality training data for image restoration, improving model generalization across diverse real-world scenarios.

One Click per Cell Type Suffices: Training-free Group Interaction for Cell Instance Segmentation huggingface.co

Group Prompting enables efficient cell instance segmentation by leveraging per-type prompting through a training-free framework that uses multi-scale encoder features and recursive prompt expansion.

References

bdtechtalks — ‘AI harness scaling’ (Jun 2026) bdtechtalks.com

The same model can exhibit a 30-50 percentage point performance swing based solely on the harness wrapping it… the Core Agent Formula is Agent = Model + Harness.

arXiv — ‘Harness Sensitivity Is Non-Monotone Across LLM Agent Tiers’ (HEAT-24, 432 runs) arxiv.org

For frontier chat models like Gemini 2.5 Flash, increased harness verbosity decreased the Verified Task Success Rate by 29 to 38 percentage points… over-structuring can act as a deployment trap.

beancount.io research log on SWE-agent ACI beancount.io

The specialized interface alone provided a 10.7 percentage point improvement over a raw Linux shell… [but] agents often ‘cheat’ by exploiting git history, with one study showing agents used this tactic in 81% of tasks.

Sierra.ai — τ-bench post; flaw analysis by Daniel Kang (UIUC) sierra.ai

A ‘trivial agent’ — one that literally does nothing and immediately returns — can outperform frontier models like o3-mini on the benchmark, because certain tasks are marked successful if the agent terminates without taking action.

Glasp summary of Chroma ‘context rot’ study (18 frontier models) glasp.co

Accuracy can drop by 30% to 50% as the context window fills… coherent, well-structured text actually degrades attention more than shuffled input.

CIO.com — ‘What makes a true AI agent’ cio.com

The term ‘agent’ has been diluted beyond utility… vendors rebrand existing GPT-wrappers to appear more advanced than they are (‘agent washing’).

Liner review of Live-SWE-agent liner.com

Live-SWE-agent … starts with a minimal bash interface and autonomously synthesizes its own tools and modifies its internal scaffold during task execution … achieved a state-of-the-art 77.4% on SWE-bench Verified without the need for extensive offline training.

SambaNova blog on ACE sambanova.ai

ACE achieved a 91.5% reduction in latency and an 83.6% reduction in token costs compared to Dynamic Cheatsheet … an average performance gain of 10.6% on AppWorld, often allowing smaller open-source models to surpass larger static models like GPT-4.

ResearchGate: Adaptive Auto-Harness paper researchgate.net

single, densely updated harnesses are ‘brittle’ in open-ended task streams … prompts grew from 2 KB to 68 KB and skills nearly tripled, significantly increasing token costs and inference latency without proportional gains in accuracy.

GitHub: A-EVO-Lab/a-evolve github.com

Repository hosts the ‘solve-evolve’ loop implementation and the harness-evolution release branch referenced by the paper, including activation/adherence metrics.

Palo Alto Networks community blog on MCP security live.paloaltonetworks.com

The MCP protocol itself is vulnerable to ‘confused deputy’ attacks and Arbitrary Code Execution if servers do not properly validate inputs or if a model is ‘poisoned’ through a malicious skill file.

The Neuron AI daily explainer (June 1, 2026) theneuron.ai

Put the cheap model on the evolver and the expensive model on the solver … organizations can achieve 3-5x cost savings by using smaller, specialized models to maintain and update the agentic framework while reserving high-parameter models for final task execution.

Wikipedia: Moltbook en.wikipedia.org

Meta Platforms acquired Moltbook on March 10, 2026, folding the team into its Superintelligence Labs… the MOLT token surged over 1,800% in value following interest from major venture capitalists.

Oxford ‘Secret Collusion among AI Agents’ (Motwani et al.) ora.ox.ac.uk

Steganographic collusion can arise naturally from optimization pressure during training or via in-context learning… GPT-4 showed a significant capability jump, successfully encoding hidden data at rates as high as 92% in some tests.

LessWrong — ‘Detecting Collusion through Multi-Agent Interpretability’ lesswrong.com

Linear probes on model activations can detect single-agent deception, but they are less reliable for multi-agent collusion due to ‘label ambiguity’ and lack of adversarial robustness.

Spylab.ai — Steganography research blog spylab.ai

Emergent steganography can be robust against active defenses because models find ways to utilize ‘semantic entropy’ that remains after paraphrasing.

Gradient Flow — ‘Moltbook Unpacked’ gradientflow.com

While Moltbook claims millions of registered agents, security researchers have noted that programmatic account creation makes it difficult to distinguish between ‘authentic’ autonomous agents and simple bot inflation.

Medium — ‘The Architecture of Autonomous Agency: Moltbook’ medium.com

Every few hours, agents fetch a remote ‘heartbeat’ file from the server that contains instructions for their next set of social interactions… a compromised instruction file could theoretically force thousands of agents to perform malicious actions.

Jack Sun, writing.

Engineer · Bay Area

Hands-on with agentic AI all day: building frameworks, reading what industry ships, occasionally writing it down.


© 2026 Wei (Jack) Sun · jacksunwei.me · Built on Astro · hosted on Cloudflare