Decent vs best-money-can-buy. Further, a self-hosted LLM will be much slower.
I think we're all past the "best-money-can-buy" stage. The most expensive models cost an order of magnitude more than the middle-ground ones, so you need to be selective about what you run where.

And with a bit of careful routing, there isn't a lot stopping you from sending the hard stuff to a cloud model and the average stuff to an on-prem model.
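
To make the routing idea concrete, here's a rough sketch of what I mean. Everything in it is made up for illustration: `estimate_difficulty`, the keyword list, the 0.5 threshold, and the two stub handlers are all assumptions, not anyone's real API. In practice you'd swap the stubs for calls to your actual on-prem and cloud endpoints, and probably use something smarter than keywords to judge difficulty.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Route:
    name: str
    handler: Callable[[str], str]  # takes a prompt, returns a completion


def estimate_difficulty(prompt: str) -> float:
    """Hypothetical heuristic: long prompts and certain keywords count as harder."""
    hard_markers = ("prove", "refactor", "multi-step", "architecture")
    score = min(len(prompt) / 2000, 1.0)  # length contributes up to 1.0
    if any(marker in prompt.lower() for marker in hard_markers):
        score += 0.5
    return min(score, 1.0)


def make_router(local: Route, cloud: Route, threshold: float = 0.5):
    """Return a function that picks a backend per prompt."""
    def route(prompt: str) -> Tuple[str, str]:
        target = cloud if estimate_difficulty(prompt) >= threshold else local
        return target.name, target.handler(prompt)
    return route


# Stub backends standing in for real model calls (assumptions, not real clients).
local = Route("on-prem", lambda p: f"[local] {p[:20]}")
cloud = Route("cloud", lambda p: f"[cloud] {p[:20]}")
route = make_router(local, cloud)

print(route("What's the capital of France?")[0])
print(route("Refactor this module into a plugin architecture")[0])
```

The threshold is the knob that trades cost against quality: raise it and more traffic stays on the cheap box.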
