Emergent introspection does not replicate on Llama-3.1-405B

LessWrong

TL;DR: I couldn't replicate Lindsey (2026) (LW post) on Llama-3.1-405B-Instruct (0/400 trials, 0/80 false positives in the canonical sweep; null robust to layer ∈ {70, 84, 90, 100} and to a dense magnitude grid at the suspected crossover). So: how does introspection emerge in models? The optimistic reading of Lindsey's result ("post-trained LLMs at sufficient scale can introspect") doesn't hold up unless Opus is much larger than 405B (Anthropic doesn't disclose its size) or Anthropic's post-trai...
