What benefit is there to dropping $50k on GPUs to run this personally besides being a cool enthusiast project?
Intel has just released a high-VRAM card that lets you get to 128GB of VRAM for $4k. Prices are dropping rapidly. Local models aren't yet tuned for this setup, so performance is disappointing, but highly capable local models are becoming increasingly realistic. https://www.youtube.com/watch?v=RcIWhm16ouQ
That's four 32GB GPUs with 600GB/s of bandwidth each. A model of this size isn't going to run well on GPUs of that class. I think something like 96GB RTX PRO 6000 Blackwells would be the minimum to run it with performance in the range of the subscription services.
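To put rough numbers on that: single-stream decoding is memory-bandwidth-bound, so tokens/sec is roughly usable bandwidth divided by the bytes of active weights read per token. A back-of-the-envelope sketch (the model size, bandwidth, and efficiency figures are illustrative assumptions, not benchmarks):

    # Back-of-the-envelope decode speed for a bandwidth-bound LLM.
    # Assumes each generated token reads every active parameter once.
    def est_tokens_per_sec(active_params_b, bits_per_weight,
                           bandwidth_gb_s, efficiency=0.6):
        # active_params_b: active params in billions (for MoE, the
        # per-token active set). efficiency: fraction of peak
        # bandwidth realistically achieved (a guess).
        bytes_per_token = active_params_b * 1e9 * bits_per_weight / 8
        return bandwidth_gb_s * 1e9 * efficiency / bytes_per_token

    # Hypothetical: a 70B dense model at 4-bit quantization.
    print(est_tokens_per_sec(70, 4, 600))   # ~10 tok/s on a 600GB/s card
    print(est_tokens_per_sec(70, 4, 1800))  # ~31 tok/s on a Blackwell-class card

This ignores prefill (which is compute-bound) and the fact that tensor parallelism across several cards can aggregate bandwidth, but it shows why the bandwidth figure matters as much as the VRAM total.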
Is it so hard to project out a couple of product cycles? Computers get better. We've gone from $50k workstation to commodity hardware several times before.
Subscription services get all the same benefits as hardware improves. And thanks to scale, batching, and better resource utilization, they'll always be able to take more advantage of it.
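The batching point is mechanical: during decode the weight read dominates, and one weight read per step can serve every request in the batch, so a provider's cost per token falls nearly linearly with batch size until compute or KV-cache memory becomes the bottleneck. A toy illustration (every number here is made up):

    # Toy model of why batching favors providers: the per-step weight
    # read is shared across all requests in the batch.
    def cost_per_token(gpu_cost_per_hour, tokens_per_sec_single,
                       batch_size, batch_efficiency=1.0):
        # batch_efficiency < 1 stands in for KV-cache and compute limits.
        throughput = tokens_per_sec_single * batch_size * batch_efficiency
        return gpu_cost_per_hour / 3600 / throughput

    solo = cost_per_token(2.0, 30, batch_size=1)  # one local request
    batched = cost_per_token(2.0, 30, batch_size=64, batch_efficiency=0.9)
    print(f"solo: ${solo:.2e}/tok, batched: ${batched:.2e}/tok")  # ~58x cheaper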
It will run exactly the same tomorrow, and the next day, and the day after that, and 10 years from now. It will be just as smart as the day you downloaded the weights. It won't stop working, exhaust your token quota, or get any worse.
That's a valuable guarantee. So valuable, in fact, that you won't get it from Anthropic, OpenAI, or Google at any price.
That's why we all still use our eMachines "Never Obsolete" PCs. They work just the same as they did 20 years ago. Though probably not, because I've never heard of hardware that's guaranteed not to fail.
Agree directionally, but you don't need $50k. $5k is plenty, and $2-3k is arguably the sweet spot.
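A quick way to sanity-check budgets like that: the quantized weights plus KV cache have to fit in VRAM. A minimal fit-check sketch, with rough assumptions for the cache and overhead figures:

    # Rough VRAM fit check: quantized weights + KV cache + overhead.
    def fits(params_b, bits_per_weight, vram_gb,
             kv_cache_gb=4.0, overhead_gb=2.0):
        weights_gb = params_b * bits_per_weight / 8  # 1B params at 8-bit ~ 1 GB
        return weights_gb + kv_cache_gb + overhead_gb <= vram_gb

    print(fits(32, 4, 24))  # 32B @ Q4 on a used 24GB card: True (~22 GB)
    print(fits(70, 4, 24))  # 70B @ Q4 on the same card: False (needs ~41 GB)
    print(fits(70, 4, 48))  # 70B @ Q4 across two 24GB cards: True

Which is roughly in line with the $2-3k sweet spot above: one or two used 24GB cards cover the model sizes most people actually run locally.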
As a local LLM novice, do you have any recommended reading to bootstrap me on selecting hardware? It has been quite confusing being a latecomer to this game. Googling yields a lot of outdated info.