When does debate help a weak judge? Evidence from code and logic

ethanelasky·LessWrong·Community·May 26, 2026

Authors: Ethan Elasky and Frank Nakasako, Palaestra Research; Naman Goyal, Independent.

ArXiv link: [will be here when available] (for now, preprint Google Drive link)

Thanks to Coefficient Giving for support and Thinking Machines for API credits; our mentor for guidance along the way; and Julian Michael, Johannes Gasteiger, and Jiaxin Wen, among others, for helpful conversations.

What we did

This is a writeup of experiments we ran on debate as a reward-labeling protocol. The basic question: if a w...


