Toward a Better Evaluations Ecosystem

·LessWrong··

Model evaluations are broken. Numbers that are often cited alongside one another as evidence of progress are rarely comparable due to inconsistent methodologies, and AI companies run and report internal evals that are unavailable to the wider community. But we can fix this.We are making deployment and safety decisions based on numbers that do not mean what people think they mean. Every other high-stakes industry has solved this the same way, by taking the measurements out of the hands of the com...

Read full article →

Related Articles

Probing the loss-band sparsity assumption in Scientist AI
Alejandro Tlaie · LessWrong · 53m ago
SFT Drives Gemini’s Safety Properties
Josh Engels · Alignment Forum · 22d ago
Sequent: scale and automation for higher confidence in alignment
Geoffrey Irving · Alignment Forum · 25d ago
A Mike's-Eye View of ARC's Research
Michael Winer · ARC · 25d ago
My research: a computational cognitive neuroscience perspective on alignment
Seth Herd · Alignment Forum · 1mo ago