We cannot safely automate value alignment evaluation and research without thinking about delegation and discretion by Maria Federica Martino Lena

Maria Federica Martino Lena · Nuno Sempere · Forecasting · May 17, 2026

This essay was originally posted on LessWrong at https://www.lesswrong.com/posts/JrRWGQKjCiLRYXSjW/we-cannot-safely-automate-value-alignment-evaluation-and-1

Introduction

Automating alignment evaluation and research is thought to be an efficient way to safeguard against uncontrollable AGI, as Joe Carlsmith and Jan Leike themselves admitted. In particular, Leike proposed a Minimal Viable Product (MVP) for alignment, consisting of: “Building a suffi...


