Training Language Models for Controlled Stochasticity
Pseudo-random generation solved the problem of sampling from mathematical distributions on computers. When a similar need arises in natural language, it is tempting to simply ask an LLM: "Name a random city." Doing so leads to the disappointing discovery that language models are heavily biased and suffer from mode collapse. For instance, when Qwen3 is asked to pick a random weekday, it chooses Wednesday about 80% of the time. Gemma-3, when naming cities, gives 75% of its answers as just four cit...