Coding assistants are currently quite hard to run locally with anything like SOTA abilities. Support in the most popular local inference frameworks is still extremely half-baked (e.g. no seamless offload for larger-than-RAM models; no tensor-parallel inference across multiple GPUs or multiple interconnected machines), and until that reliably improves it's hard to justify spending money on uber-expensive hardware one might be unable to use effectively.
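To make the "not seamless" part concrete, here's a rough sketch of the kind of manual tuning you end up doing today with llama-cpp-python (the model path and layer count below are placeholders; what actually fits depends entirely on your VRAM):

    # Hand-split a model between GPU and CPU, because the framework won't pick the split for you.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./models/coder-32b-q4.gguf",  # placeholder path to a local GGUF file
        n_gpu_layers=40,  # hand-tuned: how many layers fit in VRAM; the rest run from system RAM, slowly
        n_ctx=8192,       # context window; larger contexts eat more memory
    )

    out = llm("Write a function that reverses a linked list.", max_tokens=256)
    print(out["choices"][0]["text"])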
This is an argument against the grandparent's points (1) and (2), not their point (3).
It's one clear argument for the (so get to work!) part.
Computers get better and cheaper. That’s not a forever problem.
Source?

GPU and RAM prices have definitely not made consumer PCs cheaper than they were before bitcoin blew up or before AI blew up.

Maybe you could make an argument that they're more cost-efficient for the money... but that's not the same as cheaper when every application or program is poorly optimized. For example, why would a browser take up more than a GB or two of RAM?

And I'd postulate that R&D into local AI is another example: the big players seem hellbent that there needs to be a moat, and that the moat is data centers... the absolute opposite of optimization.