We cannot safely automate value alignment evaluation and research without thinking about delegation and discretion by Maria Federica Martino Lena

Nuno Sempere

This essay was originally posted on LessWrong at https://www.lesswrong.com/posts/JrRWGQKjCiLRYXSjW/we-cannot-safely-automate-value-alignment-evaluation-and-1

Introduction

Automating alignment evaluation and research is thought to be an efficient way to safeguard against uncontrollable AGI, as Joe Carlsmith and Jan Leike themselves admitted. In particular, Leike proposed a Minimal Viable Product (MVP) for alignment, consisting in:

“Building a suffi...
