Hacker News
The comments I see recommending selective use of cheaper models don't match the reality I experience working in the industry. I have the constant threat hanging over my head of being fired if I don't churn out code quickly enough. I'm not willing to gamble with my livelihood by using a less effective model.

Saving money on tokens isn't something that's rewarded during performance reviews; particularly because it's difficult to quantify how much you saved versus hypothetically using a more expensive model.

I think quantifying tokens used is analogous to quantifying the amount of sawdust generated on a construction site.

Churning out useful code quickly is not solved by using more tokens per unit time. Most non-technical leaders can grasp this one and are likely more interested in the strategic game theoretical dynamics that are being forced by way of implied token consumption expectations (competition between developers).

If you want to hold out as long as possible and don't really care about anything other than the compensation package, you should at least play along with this new game in a half-assed manner. Try to goldilocks your token usage between any established extremes. You want to be in the statistical barycenter of every AI report that management can create.

> I have the constant threat hanging over my head of being fired if I don't churn out code quickly enough.

And the tragedy is that this isn't sustainable, and all of us deeply involved in tech know this. There is eventually going to be a big reality check the companies will have to pay, because you can't force creativity and quality, not even with AI, because actual intelligence lies with us, at least for now and for the foreseeable future. However, when the rope eventually snaps, these executives at best will fall upwards, with big severance bonuses and a list of "contributions" we have to be grateful for. We are the ones who will suffer through the next big layoffs.

Unfortunately, I think this is correct. So it has ever been with technological change. The folks at the bottom bear the brunt of the dislocation, and the folks at the top pat themselves on the back for being so forward-looking and get huge payouts regardless of the actual results. Further, the folks at the top are always incentivized to go along with the herd of their peers, because if it works then they were on the bandwagon, and if it doesn’t work, well then, how could they have known, because “everyone was deceived.”

> the companies will have to pay, because you can't force creativity and quality

Most companies do not care about quality. The _users_ who have to interact with that software will pay the price.

Example from one of the wealthiest companies in existence, for one of its most strategic products: I was trying gemini-cli on some MCP servers just yesterday, with gemini-chat helping me configure everything. In less than 10 minutes, I stumbled upon 3 or 4 different bugs. Eventually, even gemini-chat recommended that I throw gemini-cli in the bin and move on to another agent... That's the new norm.

How much creativity do you need to fix bugs in corporate code? Almost zero. It’s maintenance, not creative work. Nothing against it, it’s needed, but let’s be real, would anybody be really sad if this work is overtaken by LLMs? I certainly won’t be, let them do it.
> How much creativity do you need to fix bugs in corporate code? Almost zero.

Have you seen the state of current corp software? I'd say a lot of creativity is still very much needed. Let's see how long this is sustainable.

> would anybody be really sad if this work is overtaken by LLMs?

I wouldn't be sad about the job itself, but about the dev who had a mortgage to pay and is now replaced by a machine churning out crap code while their superiors get sore from patting themselves on the back.

Anyone (including ANTHROP\C) "recommending selective use of cheaper models" is spending costly human time (which costs more over time) on correcting the machine (which costs less over time). This is a bad trade.

In cost per line of code, we have verified this is always an error unless your time is worth less than the machine (unlikely unless you consider your time to have no cost rather than considering it as your hourly rate).
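
To make the comparison above concrete, here is a minimal sketch in Python. Every number in it is a hypothetical placeholder, not a measurement from this thread: the token costs, the hours spent correcting output, and the hourly rate are all assumptions to be replaced with your own figures. The names `Run`, `total_cost`, and `break_even_rate` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One way of getting a task done with a model (all fields are hypothetical inputs)."""
    model_cost_usd: float  # what the tokens cost
    human_hours: float     # time spent prompting, reviewing, and correcting


def total_cost(run: Run, hourly_rate_usd: float) -> float:
    """Total cost = machine spend + human time valued at the hourly rate."""
    return run.model_cost_usd + run.human_hours * hourly_rate_usd


def break_even_rate(cheap: Run, expensive: Run) -> float:
    """Hourly rate above which the expensive run is cheaper overall.

    Assumes the expensive run needs fewer human hours than the cheap one.
    """
    extra_spend = expensive.model_cost_usd - cheap.model_cost_usd
    hours_saved = cheap.human_hours - expensive.human_hours
    if hours_saved <= 0:
        raise ValueError("expensive run saves no human time; no break-even rate exists")
    return extra_spend / hours_saved


# Hypothetical figures for illustration only.
cheap = Run(model_cost_usd=5.0, human_hours=4.0)
expensive = Run(model_cost_usd=50.0, human_hours=1.0)

rate = break_even_rate(cheap, expensive)  # (50 - 5) / (4 - 1) = 15.0
print(f"break-even hourly rate: ${rate:.2f}")
print(f"cheap total at $100/h: ${total_cost(cheap, 100.0):.2f}")
print(f"expensive total at $100/h: ${total_cost(expensive, 100.0):.2f}")
```

With these made-up inputs the break-even rate is $15/hour, so anyone whose time costs more than that comes out ahead with the expensive run. Whether that holds in practice depends entirely on the real hours saved, which this sketch does not establish.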

The worst thing for our productivity has been Claude Code or Claude Cowork taking a complex problem, turning around and writing bad instructions for dumb model agents, then synthesizing the dumb answers into an orchestra of badness.

The single best fix for results-per-total-cost is to ensure it reads and thinks about whole content, not snippets, and thinks with the smartest model, not agents.

Agents should toil. Agents should neither think*, nor decide what to think about, which is itself thinking.

* Agents should “think” like ants or bees or beavers think. Any human-like thinking, *especially* intuition-like thinking, should be thought by the best model available.

** Nobody should be “churning out code”. In a hierarchy of coders who translate detailed specs to some computer language, developers who write software that ships on a project timeline, and engineers who accomplish business goals, engineers should “churn out” engines structured for business outcomes.

Measured by that, the machine is leverage while reducing a variety of costs. At the same time, because most training data doesn't grok this, the machine doesn't grok it either. So it needs you to shape its toil.

If you have such a toxic environment, run.
If you’re sitting under a tree in the rain and it gets soaked through and you start getting wet, finding another tree won’t help you.

The whole industry is adjusting to the reality that the expected output of an engineer is much higher than it used to be. It’s not local to one company. You may find a better environment for the time being, but this is the direction everything is headed.

Maybe once we get universal income we can start recommending this. Until then the individual isn't to blame when the only option to keep providing is to keep grinding in a toxic environment.

But I'd agree that everyone can start planning a career shift that'll span a few months to some years in order to seek better working conditions. Passively accepting all work degradation because that's life and money is needed is partly responsible for the current situation too.

Where to, that's the question. The economy is in the gutters and the replace-people-with-AI craze is making the issue even worse.
Perhaps for now. But you know, after working solidly with AI for two years, adopting effective methods using detailed plans, and having a lot of success with it, here is the problem:

Coding faster leads to less understanding and higher long-term risk. Source-code amnesia is real, and there’s a time requirement to really understand and appreciate what a system is actually doing.

I’ve been able to implement very large features using frontier models, but the code needs to always be revisited.

AI can do two things: find vulnerabilities, and prototype code. It cannot design software, and any appearance of such is an illusion at best.

We don’t need to produce faster to be successful; we need to create better, long-lasting products.

Now, as you can see from the article, the tide is starting to turn. People are getting cheaper than agents on API pricing.

Copilot switches to API pricing starting next month (let's see how long it will last for our $39, and $19 since September), and Anthropic is switching all corporate customers to API-based pricing. Of the most popular choices, I think only Codex hasn't switched yet (although it is hard to tell because I don't know their enterprise pricing).

> The economy is in the gutters

Consumer sentiment is in the gutters, certainly. But objective measures of the economy like unemployment and real wages look good to excellent:

https://fred.stlouisfed.org/series/UNRATE

https://fred.stlouisfed.org/series/LES1252881600Q

And positions are open simply because someone decided to run from that place.
This. I happily used Opus 4.6 fast mode to the tune of 5k for a project. The delivery of the project justified the 5k; if I had spent only 500 but delivered the project a month later, I would have been in the doghouse.
My real comment is, why were they not just using their self-hosted copies of it? Do they pay back Anthropic for use of it in Azure? Broker a deal: let Anthropic charge you drastically less to use their model, AND Anthropic could have made Claude Code work directly with Azure for Microsoft employees. Pennies on the dollar, and Microsoft could do it using low-use GPUs to save on cost, or stack underused GPU compute (this is how serverless was born, btw: it's the unused resources in a web server somewhere).

When you consider that xAI's old data center was enough to bring Anthropic back ahead, it tells me Microsoft could host their own on underutilized previous gen GPUs that are sitting there wasting server real estate.

This. If you’re high-performing, the company won’t question your use of tokens. If they want to limit it, they have ways to set limits on spend and usage.