Coding assistants are currently quite hard to run locally with anything like SOTA abilities. Support in the most popular local inference frameworks is still extremely half-baked (e.g. no seamless offload for larger-than-RAM models; no tensor-parallel inference across multiple GPUs or multiple interconnected machines), and until that reliably improves it's hard to justify spending money on uber-expensive hardware one might be unable to use effectively.
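To make the "not seamless" part concrete, here's a rough sketch of the kind of manual tuning you end up doing today with llama-cpp-python (the model path and layer count below are placeholders; what actually fits depends entirely on your VRAM):

    # Hand-split a model between GPU and CPU, because the framework won't pick the split for you.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./models/coder-32b-q4.gguf",  # placeholder path to a local GGUF file
        n_gpu_layers=40,  # hand-tuned: how many layers fit in VRAM; the rest run from system RAM, slowly
        n_ctx=8192,       # context window; larger contexts eat more memory
    )

    out = llm("Write a function that reverses a linked list.", max_tokens=256)
    print(out["choices"][0]["text"])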
This is an argument against the grandparent's points (1) and (2), not their point (3).
It's one clear argument for the (so get to work!) part.
Computers get better and cheaper. That’s not a forever problem.
Source?

GPU and RAM prices have definitely not made consumer PCs cheaper than they were before bitcoin blew up or before AI blew up.

Maybe you could make an argument that they're more cost-efficient for the money... but that's not the same as cheaper when every application or program is poorly optimized. For example, why would a browser take up more than a GB or two of RAM?

And I'd postulate that R&D into local AI is another example: the big players seem hellbent that there needs to be a moat, and that the moat is data centers... the absolute opposite of optimization.