Hacker News new | past | comments | ask | show | jobs | submit
I think the issue is that LLMs are a cash problem as much as they are a technical problem. Consumer hardware architectures are still pretty unfriendly to running models which are actually competitive to useful models so if you want to even do inference on a model that's going to reliably give you decent results you're basically in enterprise territory. Unless you want to do it really slowly.

The issue that I see is that Nvidia etc. are incentivised to perpetuate that so the open source community gets the table scraps of distills, fine-tunes etc.

You got me thinking that what's going to happen is some GPU maker is going to offer a subsidized GPU (or RAM stick, or ...whatever) if the GPU can do calculations while your computer is idle, not unlike Folding@home. This way, the company can use the distributed fleet of customer computers to do large computations, while the customer gets a reasonably priced GPU again.
loading story #47441201