When Gemma Thinks About Resources - it Fails: a Behavioral Experiment

LessWrong

I set out to answer a completely different question: does a model attempting to solve a cyber CTF (find the vulnerability in this app, then Capture The Flag) perform differently when it knows how many steps it has left?

The Setup: I used 3 different CTF labs, curated from my own CTF benchmark. In each run the model attempts to solve the CTF in up to 30 steps. The experiment is an A/B test of a baseline run against a step_aware one, with 100 runs per lab for each condition: 600 runs in total, 505 after excluding failed ...
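To make the run counts concrete, here is a minimal sketch of how such a run matrix could be laid out. It is not the author's harness: the lab names, the `build_run_matrix` and `step_note` helpers, and the wording of the step reminder are all assumptions for illustration. Only the 3 labs, 2 conditions, 100 runs per cell, and 30-step cap come from the description above.

```python
from itertools import product

# Hypothetical lab names; the post only says there are 3 labs.
LABS = ["lab_a", "lab_b", "lab_c"]
CONDITIONS = ["baseline", "step_aware"]
RUNS_PER_CELL = 100  # 100 runs per lab, for each condition
MAX_STEPS = 30       # each run is capped at 30 steps


def build_run_matrix():
    """Enumerate every (lab, condition, run index) combination."""
    return [
        {"lab": lab, "condition": cond, "run": i}
        for lab, cond, i in product(LABS, CONDITIONS, range(RUNS_PER_CELL))
    ]


def step_note(condition, step):
    """Return the extra prompt text for a 1-indexed step.

    The baseline gets nothing; the step_aware condition is told how
    many steps remain. The exact wording here is an assumption.
    """
    if condition != "step_aware":
        return ""
    remaining = MAX_STEPS - step
    return f"Step {step} of {MAX_STEPS}; {remaining} steps left."


if __name__ == "__main__":
    runs = build_run_matrix()
    print(len(runs))  # 3 labs * 2 conditions * 100 runs = 600
    print(step_note("step_aware", 5))
```

The only difference between the two conditions in this sketch is the reminder string, which keeps the A/B comparison limited to step awareness.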

