GLM 5.2 playing text adventures
I’ve heard some buzz around the new GLM 5.2 open-weights model. They say it’s very capable! I won’t run a full comparison benchmark, but I have some credits sloshing around on OpenRouter so I figured I might compare GLM 5.2 to the similarly-priced Gemini 3 Flash[1], and see where things land.

This uses the same setup as the previous benchmark: each LLM gets a few attempts at playing the game, with each attempt being limited to a fixed budget of around $0.15. The LLM doesn’t know it, but the harne...