Wei (Jack) Sun

OpenAI publishes its eval playbook; AISI and Apollo flag what it doesn't share

Every URL the pipeline pulled into ranking for this issue: primary sources plus the supporting and contradicting findings each Researcher returned. Inline citations in the issue point back here.

Sources

A shared playbook for trustworthy third party evaluations openai.com

OpenAI shares guidance on third-party AI evaluations, covering how to assess model capabilities, safeguards, and validity for frontier systems.

References

Transformer (Shakeel Hashim) transformernews.ai

OpenAI shouldn’t be deciding if its GPT-5.5 is safe… private corporations grading their own homework is unacceptable in 2026.

The Decoder the-decoder.com

UK AISI found a universal jailbreak for GPT-5.5 in roughly six hours, but a configuration issue in the version OpenAI supplied meant the Institute could not verify whether the patched safeguards actually blocked it at launch.

The Neuron Daily (quoting Miles Brundage) theneurondaily.com

Nobody’s checking [the labs’] work — former OpenAI policy chief Miles Brundage launched a nonprofit to push for mandatory independent audits.

DeepLearning.AI / Charon Hub charonhub.deeplearning.ai

OpenAI alumni founded Averi to set standards for AI model audits, citing restricted access to weights and training data as the central obstacle to credible third-party verification.

Apollo Research apolloresearch.ai

Claude Sonnet 3.7 often verbalises that it is in an alignment evaluation; situational awareness rates have continued to climb in successor models, undermining black-box safety scores.

Help Net Security helpnetsecurity.com

AISI’s cyber time-horizon doubling rate has steepened from roughly 8 months in late 2024 to about 4.7 months in early 2026, with no observed plateau as token budgets scale.

Jack Sun, writing.

Engineer · Bay Area

Hands-on with agentic AI all day — building frameworks, reading what industry ships, occasionally writing them down.

© 2026 Wei (Jack) Sun · jacksunwei.me