Story Detail of id 48395148 | Liveview Hacker News

oblio 16 hours ago | on: Uber's $1,500/month AI limit is a useful signal for AI tool pricing

Infrastructure is massively complex and multi-cloud is super hard to do. Switching LLMs is... a dropdown.

Now, that doesn't mean running your own LLM will be easy, but it does make it a lot more likely that there will be at least regional LLMs, in my opinion. That is, there will be Google, whichever (if any) is left standing of OpenAI or Anthropic, and then there will be Chinese-hosted LLMs, probably Indian-hosted LLMs, European-hosted LLMs, plus LLMs hosted on managed services (e.g. Bedrock). I can certainly see large banks and the like being able to host the best OSS or even licensed LLMs in their own cloud infrastructure accounts (e.g. at AWS, Azure, etc.).

And that's on top of the LLMs running on owned server infrastructure, plus actual local, on-device LLMs.

jujube3 8 hours ago | parent

You're using the future tense, but all of those things already exist. Google exists, Amazon Bedrock exists, DeepSeek's cloud product exists, etc. But this isn't relevant to what the post you are replying to said, which is that "cloud-based, metered AI being a dominant work mode [is a] fad", since all of those things are cloud-based, metered AI.

dofm 7 hours ago | root | parent

I was talking more about on-premises, private-cloud and on-device stuff, as I said.

If you look at what Uber is spending per developer per month, they clearly have some headroom to consider whether more-local, unmetered AI tools on device, on premises, in private cloud, can be cost-effectively used to cut down how much money they are pouring into Anthropic and OpenAI. Not least because a bit of centralised effort might lead them to distilled models that are better for their purposes. Some of that budget could go into simply putting a bit more capacity on a developer's desk.

Can they do it now for everything? Obviously not. But IMO there is no reason at all for planning and scaffolding tasks to be done with cloud models, and there are many reasons why it might be better to do document processing without leaving the premises.

The incentives are there on the technical, operations and particularly on the business levels, and the relative disruption of the switch is really small, considering that all the tooling can use different models for different tasks already. They must at least be investigating the possibility; it would be irresponsible not to.
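
As a rough illustration of "different models for different tasks", per-task routing could be as small as a lookup table. This is only a sketch: the task labels, model names and `pick_model` helper are hypothetical and do not describe any particular tool's configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTarget:
    name: str       # hypothetical model label, not a real product
    location: str   # "on_device", "on_premises", or "cloud"
    metered: bool   # True if usage is billed per token


# Assumed split, following the comment above: planning and scaffolding stay
# local, document processing stays on premises, everything else goes to a
# hosted, metered model.
ROUTES = {
    "planning": ModelTarget("local-small", "on_device", metered=False),
    "scaffolding": ModelTarget("local-small", "on_device", metered=False),
    "document_processing": ModelTarget("onprem-distilled", "on_premises", metered=False),
}
DEFAULT = ModelTarget("hosted-frontier", "cloud", metered=True)


def pick_model(task: str) -> ModelTarget:
    """Return the configured target for a task, falling back to the cloud default."""
    return ROUTES.get(task, DEFAULT)


for task in ("planning", "document_processing", "code_review"):
    target = pick_model(task)
    print(f"{task}: {target.name} ({target.location}, metered={target.metered})")
```

Under these assumptions, only the tasks not listed in the table would hit the metered budget.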

jujube3 31 minutes ago | root | parent

Uber's not really a good example because they deliberately incentivized their engineers to spend as many tokens as possible, which was silly. But even assuming that every developer uses the full $1,500 a month of tokens that they are now allowing, that's actually not a lot of money relative to the cost of a single developer for them. It's less than 1/10 of a junior engineer's fully loaded salary.
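
A quick back-of-the-envelope check of the "less than 1/10" figure: the $1,500 monthly cap comes from the thread, while the $200,000 fully loaded cost below is purely an illustrative assumption, not a number anyone here quoted.

```python
MONTHLY_TOKEN_CAP_USD = 1_500  # figure from the thread

# Annual spend if a developer uses the full cap every month.
annual_token_spend = MONTHLY_TOKEN_CAP_USD * 12

# The cap is exactly 1/10 of the fully loaded cost at this value, so the
# claim holds whenever the real cost is higher than this.
break_even_cost = annual_token_spend * 10


def token_share(fully_loaded_annual_cost_usd: float) -> float:
    """Fraction of a developer's fully loaded annual cost spent on tokens."""
    return annual_token_spend / fully_loaded_annual_cost_usd


HYPOTHETICAL_COST_USD = 200_000  # assumption for illustration only

print(annual_token_spend)                             # 18000
print(break_even_cost)                                # 180000
print(f"{token_share(HYPOTHETICAL_COST_USD):.1%}")    # 9.0%
```

So the claim amounts to saying a junior engineer's fully loaded cost is above $180,000 a year.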

Where I would expect to see people invest in local models is in cases where a company has regulatory requirements to keep data local, or where they're doing some specialized kind of work. Neither of those really applies to Uber. In 2026, it would absolutely be irresponsible for a taxi company to try to build a better Claude than Claude.

Now, what this looks like 5 or 10 years from now is hard to say. A lot will depend on whether China keeps releasing open-weight models and whether people can still run those open-weight models on commercially available hardware.
