47 - David Rein on METR Time Horizons
YouTube link

When METR says something like “Claude Opus 4.5 has a 50% time horizon of 4 hours and 50 minutes”, what does that mean? In this episode, David Rein, METR researcher and co-author of the paper “Measuring AI ability to complete long tasks”, talks about METR’s work on measuring time horizons, the methodology behind those numbers, and what work remains to be done in this domain.

Topics we discuss:
Measuring AI Ability to Complete Long Tasks
The meaning of “task length”
Examples of interme...