Story Detail of id 47679222 | Liveview Hacker News

johnfn 19 hours ago | on: GLM-5.1: Towards Long-Horizon Tasks

GLM-5.0 is the real deal as far as open source models go. In our internal benchmarks it consistently outperforms other open source models, and was on par with things like GPT-5.2. Note that we don't use it for coding - we use it for more fuzzy tasks.

sourcecodeplz 18 hours ago | parent | next

Yep, I haven't tried 5.1, but for my PHP coding, GLM-5 is 99% on the same level as Sonnet/Opus/GPT-5. It is unbelievably strong for what it costs, not to mention you can run it locally.

deepsquirrelnet 18 hours ago | parent | next

I am working on a large-scale dataset for producing agent traces for Python <> Cython conversion with tooling, and it is second only to Gemini Pro 3.1 in acceptance rates (16% vs. 26% for Gemini).

Mid-sized models like gpt-oss, minimax, and qwen3.5 122b are around 6%, and gemma4 31b is around 7% (but much slower).

I haven’t tried Opus or ChatGPT due to high costs on OpenRouter for this application.

foopod 10 hours ago | parent | next

It really bothers me that people refer to open weight models as being open source. They fundamentally aren't and are more akin to freeware than anything else.

epolanski 17 hours ago | parent

I noticed the same thing.

My use cases are not related to code editing or authoring, but when it comes to understanding a codebase and its docs to help stakeholders write tasks or understand systems, it has always outperformed American models at roughly half the price.
