Retrying vs Resampling in AI Control

·LessWrong··

We’ve just released a new paper: Retrying vs Resampling in AI Control. We revisit the resampling protocols introduced in Ctrl-Z with an up-to-date setting and much stronger models, and compare them against “retrying” protocols similar to Claude Code auto mode or Codex Auto-review.MotivationRoughly a year ago we released Ctrl-Z, the first paper to study control techniques for agents. A headline result of that paper was the performance of resample protocols – strategies that involve taking multipl...

Read full article →

Related Articles

US bans differential privacy in Census data
nl · Hacker News · 2h ago
Arch Linux Now Believes Malware Incident Under Control: More Than 1,500 Packages
qwertox · Hacker News · 4h ago
Twenty One Zero-Days in FFmpeg
redbell · Hacker News · 18h ago
CRISPR tech selectively shreds cancer cells, including "undruggable" cancers
gmays · Hacker News · 1d ago
Kimi K2.7-Code: open-source coding model with better token efficiency
nekofneko · Hacker News · 1d ago