Aligned Agents Still Build Misaligned Organisations

Rohit Krishnan·Strange Loop Canon·Metascience·April 24, 2026

By now, we have plenty of examples of AI agent misalignment. They lie, they sometimes cheat, they break rules, they demonstrate odd preferences for self-preservation or against self-preservation. They reward hack! Quite a bit has been studied about them, and many of these faults have been ameliorated enough that we use them all the time. But we’re starting to go beyond a single agent. We’re setting up multi-agent workflows. Agents are working with other agents, autonomously or semi-autonomously, t...
