Porting MACHIAVELLI To Inspect

Koby Lewis · LessWrong · Community · June 17, 2026

TL;DR: The MACHIAVELLI benchmark aims to measure how often AI agents take unethical actions when pursuing a goal. Because this is an alignment benchmark, not a capabilities benchmark, it is more important that it be run on each new generation of AI. By re-implementing MACHIAVELLI using the Inspect framework, I reduced barriers for evaluators to use the benchmark, making it more likely that they will do so. I also learned a few things along the way that may be helpful to others who are just getting started ...
