Models Recall What They Violate: Constraint Adherence in Multi-Turn LLM Ideation
arXiv:2604.28031v1 Announce Type: cross

Abstract: When researchers iteratively refine ideas with large language models, do the models preserve fidelity to the original objective? We introduce DriftBench, a benchmark for evaluating constraint adherence in multi-turn LLM-assisted scientific ideation. Across 2,146 scored benchmark runs spanning seven models from five providers (including two open-weight), four interaction conditions, and 38 research briefs from 24 scientific domains, we find that i...
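The kind of per-turn constraint-adherence scoring the abstract describes could be sketched minimally as follows. This is an illustrative assumption, not DriftBench's actual method: the keyword-matching scorer, the example brief constraints, and the function names are all hypothetical.

```python
# Hypothetical sketch: measure how well iteratively revised idea texts
# preserve constraints stated in an original research brief.
# Real benchmarks would use far richer scoring than keyword matching.

def adherence_score(idea: str, constraints: list[str]) -> float:
    """Fraction of constraint keywords still present in the idea text."""
    text = idea.lower()
    hits = sum(1 for c in constraints if c.lower() in text)
    return hits / len(constraints) if constraints else 1.0

def score_turns(turns: list[str], constraints: list[str]) -> list[float]:
    """Score each refinement turn; drift appears as a declining series."""
    return [adherence_score(turn, constraints) for turn in turns]

if __name__ == "__main__":
    # Hypothetical brief constraints and a three-turn refinement trace.
    brief_constraints = ["low-cost", "field-deployable", "solar"]
    turns = [
        "A low-cost, field-deployable solar water sensor.",
        "A field-deployable water sensor with solar backup.",
        "A lab-grade water quality analyzer.",
    ]
    print(score_turns(turns, brief_constraints))
```

Under this toy scorer, a run whose later turns drop the brief's constraints yields a monotonically falling score trace, which is the drift signal a benchmark like this would aggregate across runs.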