OpenAI publishes its eval playbook; AISI and Apollo flag what it doesn't share
This page lists every URL the pipeline pulled into ranking for this issue: the primary sources, plus the supporting and contradicting findings each Researcher returned. Inline citations in the issue point back here.
Sources
A shared playbook for trustworthy third party evaluations openai.com
OpenAI shares guidance on third-party AI evaluations, covering how to assess model capabilities, safeguards, and validity for frontier systems.
References
Transformer (Shakeel Hashim) transformernews.ai
OpenAI shouldn’t be deciding if its GPT-5.5 is safe… private corporations grading their own homework is unacceptable in 2026.
The Decoder the-decoder.com
UK AISI found a universal jailbreak for GPT-5.5 in roughly six hours, but a configuration issue in the version OpenAI supplied meant the Institute could not verify whether the patched safeguards actually blocked it at launch.
The Neuron Daily (quoting Miles Brundage) theneurondaily.com
Nobody’s checking [the labs’] work — former OpenAI policy chief Miles Brundage launched a nonprofit to push for mandatory independent audits.
DeepLearning.AI / Charon Hub charonhub.deeplearning.ai
OpenAI alumni founded Averi to set standards for AI model audits, citing restricted access to weights and training data as the central obstacle to credible third-party verification.
Apollo Research apolloresearch.ai
Claude Sonnet 3.7 often verbalises that it is in an alignment evaluation; situational awareness rates have continued to climb in successor models, undermining black-box safety scores.
Help Net Security helpnetsecurity.com
AISI’s cyber time-horizon doubling rate has steepened from roughly 8 months in late 2024 to about 4.7 months in early 2026, with no observed plateau as token budgets scale.