Should we train LLMs to be human?

Hubert Plisiecki · LessWrong · Community · May 27, 2026

A recent piece of research on human behavioral alignment shows that post-training makes LLMs less human-like in their responses (Binz et al., 2026). The obvious follow-up question is whether this drift is intended, and whether it is optimal. From the broader perspective of goal alignment, this tendency could reduce the ability of LLMs to model human behavior and, by proxy, to understand our needs and wants. On the other hand, it could be argued that some parts of human behavior are s...

