Wei (Jack) Sun

Nemotron Nano Omni is a sub-agent; Codex's prompt is a post-mortem

Every URL the pipeline pulled into ranking for this issue — primary sources plus the supporting and contradicting findings each Researcher returned. Inline citations in the issue point back here.

Sources

Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents huggingface.co

Quoting OpenAI Codex base_instructions simonwillison.net

Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user’s query. (OpenAI Codex base_instructions, for GPT-5.5)

References

BenchLM head-to-head leaderboard benchlm.ai

Qwen3.6-27B currently holds an aggregate lead (74 to 56), showing superior performance in coding and complex mathematical reasoning

Open WebUI GitHub discussion #24264 github.com

files exceeding 10-15 seconds can trigger transcription errors (e.g., 'NoneType' object has no attribute 'strip') or persistent disconnection alerts

E2E Networks ASR benchmark (L4 GPU) e2enetworks.com

Parakeet-TDT 0.6B has recorded a 6.34% WER, surpassing Whisper Large-v3 Turbo’s 7.8%… a 35-minute audio file being processed in 18 seconds by Parakeet TDT, while Whisper Turbo took 3 minutes

vLLM project blog on Nemotron-Omni serving vllm.ai

EVS integrated directly into the serving pipeline… Independent testing indicates that EVS can reduce time-to-first-token (TTFT) by up to 4x with minimal impact on accuracy

The Decoder the-decoder.com

pass-rate filtering: prompts that the model can already solve with over 80% accuracy at initialization are discarded, focusing RL efforts on complex, unsolved cases

K-Dense AI partner deployment notes k-dense.ai

many implementers still pair the Nano Omni (as the ‘eyes and ears’) with larger models like Nemotron-3 Ultra for high-level decision-making

Engadget — ‘ChatGPT developed a goblin obsession after OpenAI tried to make it nerdy’ engadget.com

Reward signals meant to encourage a ‘Nerdy’ personality accidentally over-rewarded creature-heavy metaphors, causing ‘goblin’ to appear in outputs nearly 175% more often than in previous iterations.

Developpez.com — OpenAI post-mortem coverage intelligence-artificielle.developpez.com

The reward signal was carried over into the subsequent supervised fine-tuning data, ‘baking’ the tic into the model’s architecture; OpenAI had to write ‘never talk about goblins’ four times in the Codex prompt.

VentureBeat — ‘Why OpenAI’s goblin problem matters’ venturebeat.com

The ‘Release the Goblins’ problem: the ironic tendency of negative constraints to act as attractors rather than deterrents… OpenAI’s own documentation tells developers to avoid ‘don’t do X’ instructions, yet their internal Codex prompt does exactly that — a ‘do as I say, not as I do’ inconsistency.

Reddit r/AI_Agents — ‘Codex’s system prompt is mostly about sandboxing’ reddit.com

Unlike Anthropic’s Claude Code, which leads with an identity as a ‘software engineering task’ agent, the Codex prompt focuses immediately on filesystem sandboxing rules… instructing the model to split commands at pipes or && to evaluate each segment against security restrictions.

Business Insider — ‘OpenAI really, really wants GPT-5.5 to stop talking about goblins’ businessinsider.com

Some observers initially speculated the list functioned as ‘canary words’ — randomized tokens used to detect prompt injection or data leakage — but the thematic consistency suggested a targeted behavioral fix instead.

Reddit r/PromptEngineering — negative constraints discussion reddit.com

In a 2026 battery of 36 cross-task tests, prompts using negative-only constraints scored 72/120 versus 116/120 for affirmative framing… adding self-evident constraints to Claude-Sonnet-4.5 led to performance drops of up to 35%.

Jack Sun, writing.

Engineer · Bay Area

Hands-on with agentic AI all day — building frameworks, reading what industry ships, occasionally writing them down.


© 2026 Wei (Jack) Sun · jacksunwei.me Built on Astro · hosted on Cloudflare