There's going to be a limit to how much they can raise prices, because someone can always build out a datacenter and fill it up with open source DeepSeek inference and undercut your prices by 10x while still making a very good ROI--and that's a business model right there. Right now I'm sure there's a lot of people who will protest that they couldn't do their jobs with lesser models, but as time goes on that will get less and less. Already right now the consumers who are using AI for writing presentations, cooking recipe generation and ELI5 answers for common things, aren't going to be missing much from a lesser model. That'll actually only start to get cheaper over time.
Also for business needs, as AI inference costs escalate there comes a point where businesses rediscover human intelligence again, and start hiring/training people to do more work to use lesser models--if that is more productive in the end than shelling out large amounts of cash for inference on the latest models. [Although given how much companies waste on AWS, there's a lot of tolerance for overspending in corporations...]
Not sure how it all works out. Currently trillion dollar companies can't make a native app for platforms. Everything is just JS/Electron because economics does not work for them.
And here companies can make GW data center running very expensive GPUs for 1/10th of current prices. Sound little fanciful to me.
And at some point even frontier model costs will hopefully come down (if there is still a meaningful difference between closed and open source models at that point) as all of the compute that's being built out right now comes online.