The issue I see is that Nvidia etc. are incentivised to perpetuate that dynamic, so the open source community gets the table scraps: distills, fine-tunes, etc.
Plus, most users don't want to host their own models. Most users don't care that OpenAI, Anthropic and Google have a monopoly on LLMs. ChatGPT is a household name, and most of the big businesses are forcing Copilot and/or Claude onto their employees for "real work."
This is "everyone will have an email server/web server/Diaspora node/lemmy instance/Mastodon server" all over again.
It's probably a trade secret, but what's the actual per-user resource requirement to run the model?
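Nobody outside the labs knows the real numbers, but you can do a rough back-of-envelope estimate: the weights are shared across all users, while each concurrent session costs roughly one KV cache. The sketch below uses purely assumed figures (a 70B-parameter model at FP8 with an 80-layer, GQA-style architecture at FP16 KV cache), not any vendor's actual deployment:

```python
# Back-of-envelope serving memory estimate. All figures are assumptions
# loosely modelled on a 70B-class open-weights model, not published specs.

def serving_memory_gb(params=70e9, bytes_per_param=1,        # FP8 weights
                      layers=80, kv_heads=8, head_dim=128,
                      context_len=8192, bytes_per_kv=2):     # FP16 KV cache
    """Return (weights_gb, kv_cache_gb_per_concurrent_user)."""
    weights = params * bytes_per_param / 1e9
    # KV cache: 2 tensors (K and V) per layer, per KV head, per token position.
    kv = 2 * layers * kv_heads * head_dim * context_len * bytes_per_kv / 1e9
    return weights, kv

weights_gb, kv_gb = serving_memory_gb()
print(f"weights: {weights_gb:.0f} GB, shared across all users")
print(f"KV cache: {kv_gb:.2f} GB per concurrent 8k-token session")
```

Under those assumptions you get ~70 GB of shared weight memory plus a couple of GB per active session, which is why batching many users onto one GPU node is where the economics come from.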
If the open-weights models are good, there will be people looking to sell commodity access to them, much like a cloud provider selling you compute.