When does debate help a weak judge? Evidence from code and logic

LessWrong

Authors: Ethan Elasky and Frank Nakasako, Palaestra Research; Naman Goyal, Independent.

ArXiv link: [will be here when available] (for now, preprint Google Drive link)

Thanks to Coefficient Giving for support and Thinking Machines for API credits; our mentor for guidance along the way; and Julian Michael, Johannes Gasteiger, and Jiaxin Wen, among others, for helpful conversations.

What we did

This is a writeup of experiments we ran on debate as a reward-labeling protocol. The basic question: if a w...

