If LLMs turn out to be such a force multiplier, the way to fight it is to ensure that there are open source LLMs.
I think the issue is that LLMs are a cash problem as much as they are a technical problem. Consumer hardware is still pretty unfriendly to running genuinely competitive models, so if you want to do inference on a model that's going to reliably give you decent results, you're basically in enterprise territory. Unless you want to do it really slowly.

The issue that I see is that Nvidia etc. are incentivised to perpetuate that, so the open-source community gets the table scraps: distills, fine-tunes, etc.

You got me thinking that what's going to happen is some GPU maker is going to offer a subsidized GPU (or RAM stick, or ...whatever) if the GPU can do calculations while your computer is idle, not unlike Folding@home. This way, the company can use the distributed fleet of customer computers to do large computations, while the customer gets a reasonably priced GPU again.
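A minimal sketch of what such a volunteer client could look like, assuming a hypothetical coordinator at jobs.example.com; the endpoint, the idle check, and the run_on_gpu helper are all invented for illustration:

    import time
    import requests  # third-party HTTP client: pip install requests

    WORK_SERVER = "https://jobs.example.com"  # hypothetical coordinator
    IDLE_THRESHOLD_S = 300  # only volunteer the GPU after 5 minutes of idle time

    def seconds_idle() -> float:
        # Stub: a real client would ask the OS how long the user has been idle.
        return 600.0

    def run_on_gpu(payload: bytes) -> bytes:
        # Stub: a real client would dispatch this work unit to the local GPU.
        return payload

    while True:
        if seconds_idle() >= IDLE_THRESHOLD_S:
            # Fetch a work unit, compute, upload the result: Folding@home style.
            job = requests.get(f"{WORK_SERVER}/work-unit", timeout=30).json()
            result = run_on_gpu(bytes.fromhex(job["payload"]))
            requests.post(f"{WORK_SERVER}/result/{job['id']}", data=result, timeout=30)
        else:
            time.sleep(60)  # user is active; check back in a minute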
The kinds of GPUs in use in enterprise are $30-40k and require a ~10 kW system. The challenge with lower-power cards is that thirty $1k cards are not as powerful, especially since you usually have several enterprise cards in a single unit joined efficiently via a high-bandwidth link. But even if someone else is paying the utility bill, what happens when the person you gave the card to just doesn't run the software? Good luck getting your GPU back.
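Rough numbers behind that point; the spec figures below are approximate public ones, picked for illustration, not exact:

    # Approximate public specs, for illustration only.
    enterprise = {"fp16_tflops": 990, "link_gb_s": 900}  # H100-class card, NVLink
    consumer   = {"fp16_tflops": 330, "link_gb_s": 32}   # 4090-class card, PCIe 4.0 x16

    n_consumer = 30  # thirty ~$1k cards vs one ~$30k enterprise card, per above

    # On paper, the consumer pile wins on raw compute...
    print(n_consumer * consumer["fp16_tflops"], "vs", enterprise["fp16_tflops"], "TFLOPS")

    # ...but model parallelism crosses the link on every step, and each hop
    # has roughly 28x less bandwidth, so the fleet never gets near its paper peak.
    print(enterprise["link_gb_s"] / consumer["link_gb_s"], "x link bandwidth gap")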
Open-source models will never be _truly_ competitive as long as obtaining quality datasets and training on them remains prohibitively expensive.

Plus, most users don't want to host their own models. Most users don't care that OpenAI, Anthropic, and Google have an effective oligopoly on LLMs. ChatGPT is a household name, and most big businesses are forcing Copilot and/or Claude onto their employees for "real work."

This is "everyone will have an email server/web server/Diaspora node/lemmy instance/Mastodon server" all over again.

The problem is that even if an OSS project had the resources (massive data centers the size of NYC, packed with top-end custom GPU kit) to produce the weights, you still need enormous VRAM-laden farms of GPUs to do inference on a model like Opus 4.6. Unless the very math of frontier LLMs changes, don't expect frontier-level OSS models to be practical.
I feel like you're overstating the resources required by a couple of orders of magnitude. You do need a GPU farm for training, but probably only $100M, maybe $1B, of GPUs. And yes, that's a lot of GPUs, but they will fit in a single datacenter, and even in dollar terms there are many individual buildings in NYC that are cheaper.
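A quick sanity check on those dollar figures, using the ~$30k-per-card number from upthread; the 8-GPUs-per-server density is an assumption:

    cost_per_gpu = 30_000        # dollars, per the $30-40k figure upthread
    gpus_per_server = 8          # assumed typical server density
    for budget in (100e6, 1e9):  # the $100M and $1B estimates
        n_gpus = budget / cost_per_gpu
        print(f"${budget:,.0f} buys ~{n_gpus:,.0f} GPUs "
              f"(~{n_gpus / gpus_per_server:,.0f} servers)")

That's roughly 3,000 to 33,000 accelerators: a large single building, not a city.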
> you need enormous VRAM laden farms of GPUs to do inference on a model like Opus 4.6.

It's probably a trade secret, but what's the actual per-user resource requirement to run the model?
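Nobody outside the labs knows, but the standard back-of-envelope splits it into weights (shared by every concurrent user) plus a per-user KV cache. Every parameter value below is invented; frontier model configs aren't public:

    # All numbers hypothetical; frontier configs are trade secrets.
    params_b        = 400      # assumed parameter count, billions
    bytes_per_param = 1        # assumed 8-bit quantized weights
    layers          = 100      # assumed transformer depth
    kv_heads        = 8        # assumed KV heads (grouped-query attention)
    head_dim        = 128
    context_tokens  = 100_000
    kv_bytes        = 2        # fp16 keys and values

    weights_gb = params_b * bytes_per_param  # shared across all users
    # Per-user KV cache: 2 (K and V) * layers * kv_heads * head_dim * tokens * bytes
    kv_per_user_gb = 2 * layers * kv_heads * head_dim * context_tokens * kv_bytes / 1e9

    print(f"weights: ~{weights_gb} GB, amortized over every concurrent user")
    print(f"KV cache: ~{kv_per_user_gb:.0f} GB per user at {context_tokens:,} tokens")

Under those made-up numbers, the marginal cost per user is a ~40 GB KV cache and a batch slot, not another copy of the weights, which is why batched serving amortizes so well.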

{"deleted":true,"id":47440564,"parent":47440347,"time":1773932097,"type":"comment"}
There's already an ecosystem of essentially undifferentiated infrastructure providers selling cheap inference on open-weights models at pretty tight margins.

If the open-weights models are good, there are people looking to sell commodity access to them, much like a cloud provider selling you compute.
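Most of those providers expose an OpenAI-compatible endpoint, so the commodity part is real: switching vendors is roughly a base-URL change. The URL and model name here are placeholders:

    from openai import OpenAI  # pip install openai

    # Any OpenAI-compatible host works; base URL and model name are placeholders.
    client = OpenAI(base_url="https://inference.example.com/v1", api_key="YOUR_KEY")

    resp = client.chat.completions.create(
        model="some-open-weights-model",
        messages=[{"role": "user", "content": "Hello"}],
    )
    print(resp.choices[0].message.content)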

That would be accepting the framing of your class enemy; there is no reason to do that.
Unless they are also pirate LLMs, I don't see how any open-source project could have pockets deep enough for the datacenters needed to seriously contend.