GLM 5.2 playing text adventures

kqr·LessWrong·Community·June 18, 2026

I’ve heard some buzz around the new GLM 5.2 open-weights model. They say it’s very capable! I won’t run a full comparison benchmark, but I have some credits sloshing around on OpenRouter, so I figured I might compare GLM 5.2 to the similarly-priced Gemini 3 Flash[1], and see where things land.

This uses the same setup as the previous benchmark: each LLM gets a few attempts at playing the game, with each attempt being limited to a fixed budget of around $0.15. The LLM doesn’t know it, but the harne...


