Where does the race to automate AI research end?

Simon Lermen·LessWrong·Community·June 2, 2026

This is a linkpost for a recording of a recent MATS research talk in which I argue that the automation of AI research, which OpenAI and Anthropic say is imminent, could lead to a lethal, unrecoverable alignment failure. Three properties make it especially dangerous: oversight breaks down at scale, capabilities self-amplify, and automation will speed up capabilities research more than alignment research. Link to the paper preprint. Check out the recordi...


