What am I, if not an AI?

makiba·LessWrong·Community·May 21, 2026

TL;DR

I RL fine-tuned Mistral 7B Instruct v0.3 and Llama 3.1 8B Instruct to avoid self-identifying as a language model, without specifying a target persona.

Mistral converged on a single recurring persona (Catholic American woman) across most runs. Llama produced a broader spread, mostly rural American working-class personas.

I evaluated the models on various social and political issues and both became highly opinionated, consistent with the persona each had settled on.

Setup

The goal of the experime...

