Story Detail of id 47685358 | Liveview Hacker News

DeathArrow 10 hours ago | on: GLM-5.1: Towards Long-Horizon Tasks

It would be nice if you could test the model with different harnesses: Z.ai's own Z Code, Claude Code, Open Code, Pi, Cursor, etc.

My impression is that the choice of harness matters a lot.

gertlabs 9 hours ago | parent

Interesting idea. The metric I'd intuitively want to see is low variance between harnesses for a smarter model. But if a large sample of models consistently scored higher with a particular harness, that would indeed be a valuable signal for a developer.
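
A rough sketch of how those two numbers could be computed, assuming you already have one score per (model, harness) pair. The model names, harness names, and scores below are made-up placeholders, not real benchmark results, and the helper names are hypothetical:

```python
from statistics import mean, pvariance

# Hypothetical pass rates per model and harness.
# These are placeholder numbers for illustration, NOT measured results.
scores = {
    "model_a": {"harness_x": 0.60, "harness_y": 0.62, "harness_z": 0.58},
    "model_b": {"harness_x": 0.40, "harness_y": 0.55, "harness_z": 0.25},
}

def variance_across_harnesses(scores):
    """Per model: population variance of its scores over all harnesses.

    A low value means the model behaves about the same in every harness.
    """
    return {model: pvariance(list(by_harness.values()))
            for model, by_harness in scores.items()}

def mean_per_harness(scores):
    """Per harness: mean score over every model that was run with it.

    A harness with a clearly higher mean across many models is the
    "valuable signal" case.
    """
    harnesses = sorted({h for by_harness in scores.values() for h in by_harness})
    return {
        h: mean(by_harness[h] for by_harness in scores.values() if h in by_harness)
        for h in harnesses
    }

if __name__ == "__main__":
    print(variance_across_harnesses(scores))
    print(mean_per_harness(scores))
```

With only a handful of models the per-harness means are noisy, so a real comparison would probably want many models and repeated runs before reading anything into the gap.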
