Optimisation over non-stationary distributions creates weirder minds

Samuel Ratnam·LessWrong·Community·June 6, 2026

TLDR: Sequentially mixing training objectives incentivises different training dynamics depending on the distinguishability of the training environments and the amount of pressure for shared circuitry. We classify these patterns into three classes: ecological generalists, conditional policies, and strategy churn. We suggest that careful consideration of the pressures of non-stationary training dynamics can allow us to shape the minds of AI systems in more intentional and fine-grained ways. Modern ...


