During off-peak hours, a simple 3-line CSS change took over 50 minutes, and it routinely times out mid-tool-call, leaving dangling XML and tool calls everywhere, badly overwriting files or patching duplicate lines into them.
Starting an hour or two ago, GLM's API endpoint has been failing 7 out of 8 times for me. My editor is retrying every request with backoff, over a dozen times before one succeeds, and wildly simple changes are taking over 30 minutes per step.
But it's all casual side projects.
Edit: I often /compact at around 100,000 tokens or switch to a new session. Maybe that is why.
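
For context, the "retrying with backoff" my editor does is just the standard exponential-backoff-with-jitter pattern. A minimal sketch, assuming a generic HTTP endpoint; the URL, retry limits, and status handling here are illustrative, not anything GLM-specific:

```python
import random
import time

import requests

def post_with_backoff(url, payload, max_retries=12, base_delay=1.0, cap=60.0):
    """Retry a POST with exponential backoff and jitter until it succeeds.

    All the parameters here are hypothetical defaults for illustration.
    """
    for attempt in range(max_retries):
        try:
            resp = requests.post(url, json=payload, timeout=120)
            # Retry only on rate limits (429) and server errors (5xx).
            if resp.status_code < 500 and resp.status_code != 429:
                return resp
        except requests.RequestException:
            pass  # connection error / timeout: fall through and retry
        # Sleep 1s, 2s, 4s, ... capped at `cap`, with jitter on top.
        time.sleep(min(cap, base_delay * 2 ** attempt) * random.uniform(0.5, 1.5))
    raise RuntimeError(f"gave up after {max_retries} attempts")
```

The jitter matters: without it, every stuck client retries in lockstep and hammers an already-struggling endpoint.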
For the price this is a pretty damn impressive model.
Providers like DeepInfra are already giving access to 5.1 https://deepinfra.com/zai-org/GLM-5.1
$1.40 in / $4.40 out / $0.26 cached, per 1M tokens.
That's more expensive than other models, but not terrible; it will come down over time, and it's far, far cheaper than Opus, Sonnet, or GPT.
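
To put those rates in perspective, here's a back-of-the-envelope cost for a single agentic coding step at those prices. The token counts are made up for illustration, not measured:

```python
# DeepInfra's listed GLM-5.1 rates, USD per 1M tokens.
IN_RATE, OUT_RATE, CACHED_RATE = 1.40, 4.40, 0.26

def step_cost(input_tokens, output_tokens, cached_tokens=0):
    """Return the USD cost of one request at the rates above."""
    return (input_tokens * IN_RATE
            + output_tokens * OUT_RATE
            + cached_tokens * CACHED_RATE) / 1_000_000

# e.g. a 50k-token prompt (40k of it a cache hit) producing 2k output tokens:
print(f"${step_cost(10_000, 2_000, cached_tokens=40_000):.4f}")  # $0.0332
```

So even a long agentic session with hundreds of such steps stays in the single-digit dollars, which is where the "far cheaper than Opus" comparison comes from.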
I haven't had any bad luck with DeepInfra in particular, whether with quantization or rate limiting. But I've heard only bad things from people who used z.ai directly.
Devil's advocate: why shouldn't they do it if OpenAI, Anthropic and Google get away with playing this game?