I find LLMs so much more exhausting than manual coding. It’s interesting. I think you bump into the limit of how much a single human can feasibly keep track of pretty fast with modern LLMs.

I assume until LLMs are 100% better than humans in all cases, as long as I have to be in the loop there will be a pretty hard upper bound on what I can do and it seems like we’ve roughly hit that limit.

Funny enough, I get this feeling with a lot of modern technology. iPhones, all the modern messaging apps, etc. make it much too easy to fragment your attention across a million different things. It’s draining. Much more draining than the old days.

> I find LLMs so much more exhausting than manual coding

I do as well, so totally know what you're talking about. There's part of me that thinks it will become less exhausting with time and practice.

In high school and college I worked at this Italian place that did dine-in, to-go, and delivery orders. I got hired as a delivery driver and loved it. A couple years in there was a spell where they had really high turnover, so the owners asked me to be a waiter for a little while. The first couple months I found the small talk and the need to always be "on" absolutely exhausting, but over time I found my routine and it became less exhausting. I definitely loved being a delivery driver far more, but eventually I did hit a point where I didn't feel completely drained after every shift of waiting tables.

I can't help but think coding with LLMs will follow a similar pattern. I don't think I'll ever like it more than writing the code myself, but I have to believe at some point I'll have done it enough that it doesn't feel completely draining.

I think it's because traditionally, software engineering was a field where you built your own primitives, then composited those, etc... so that the entire flow of data was something that you had a mental model for, and when there was a bug, you simply sat down and fixed the bug.

With the rise of open source, there started to be more black-box compositing: you grabbed some big libraries like Django or NumPy and honestly just hoped there weren't any bugs, but if there were, you could plausibly step through the debugger, figure out what was going wrong, and file a bug report.

Now, the LLMs are generating so many orders of magnitude more code than any human could ever have the chance to debug that you're basically just firing this stuff out like a firehose on a house fire, giving it as much control as you can muster but really just trusting the raw power of the thing to get the job done. And, bafflingly, it works pretty well, except in those cases where it doesn't, so you can't stop using the tool but you can't really ever get comfortable with it either.

> I think it's because traditionally, software engineering was a field where you built your own primitives, then composited those, etc... so that the entire flow of data was something that you had a mental model for

Not just that, but the fact that with programming languages you can have the utmost precision to describe _how_ the problem needs to be solved _and_ you can have some degree of certainty that your directions (code) will be followed accurately.

It’s maddening to go from that to using natural language which is interpreted by a non-deterministic entity. And then having to endlessly iterate on the results with some variation of “no, do it better” or, even worse, some clever “pattern” of directing multiple agents to check each other’s work, which you’ll have to check as well eventually.

> bafflingly, it works pretty well, except in those cases where it doesn't

So as a human, you would make the judgement that the cases where it works well enough more than make up for the mistakes. Comfort is a mental state, and the discomfort can be easily defeated by separating your own identity and ego from the output you create.

Thanks for the story. I also spent time as a delivery driver at an Italian restaurant. It was a blast in the sense that I look back at that slice of life with pride and becoming. Never got the chance to be a waiter, but they were definitely characters and worked hard for their money. Also the cooking staff. What a hoot.
I think the upper limit is your ability to decide what to build among infinite possibilities. How should it work, what should it be like to use it, what makes the most sense, etc.

The code part is trivial and a waste of time in some ways compared to time spent making decisions about what to build. And sometimes it is even a form of procrastination to avoid thinking about what to build, like how people polish their game engine (easy) to avoid putting in the work to plan a fun game (hard).

The more clarity you have about what you’re building, the larger the blocks of work you can delegate / outsource.

So I think one overwhelming part of LLMs is that you don’t get the downtime of working on implementation since that’s now trivial; you are stuck doing the hard part of steering and planning. But that’s also a good thing.

I've found writing the code massively helps your understanding of the problem and what you actually need or want. Most times I go into a task with a certain idea of how it should work, and then reevaluate once I've started. An LLM, by contrast, will just do what you ask without questioning, leaving you with none of the learnings you would have gained by doing it yourself. The LLM certainly didn't learn or remember anything from it.
In some cases, yes. But I’ve been doing this a while now and there is a lot of code that has to be written that I will not learn anything from. And now, I have a choice not to write it.
Ehh, I find that the most tedious code is also the most sensitive to errors, stuff that blurs the divide between code and data.
I doubt we're talking about the same sort of things at all. I'm talking about stuff like generic web CRUD. Too custom to be generated deterministically, but recent models crush it and make fewer errors than I do. And that is not even all they can do. Yes, once you get into a large, complicated code base it's not always worth it, but even there one benefit is using it to develop more test cases, and more complicated ones, than I would realistically bother with.
I actually like writing the tedious code by hand.

The whole time I'm doing it, I'm trying to think of better ways. I'm thinking of libraries, utilities or even frameworks I could create to reduce the tedium.

This is actually one of the things I dislike the most about LLM coding: they have no problem with tedium and will happily generate tens of thousands of lines where a much better approach could exist.

I think it's an innovation killer. Would any of the ORMs or frameworks we have today exist if we'd had LLMs this whole time?

I doubt it.

It depends on how you use them. In my workflow, I work with the LLM to get the desired result, and I'm familiar with the system architecture without writing any of the code.

I've written it up here, including the transcript of an actual real session:

https://www.stavros.io/posts/how-i-write-software-with-llms/

Thanks for writing this up.

I just woke up recently myself and found out these tools were actually becoming really, really good. I use a similar prompt system, but not as much focus on review - I've found the review bots to be really good already but it is more efficient to work locally.

One question I have, since you mention using lots of different models: do you ever have to tweak prompts for a specific model, or are these things pretty universal?

I don't tweak prompts, no. I find that there's not much need to, the models understand my instructions well enough. I think we're way past the prompt engineering days, all models are very good at following instructions nowadays.
Right, when you're coding with an LLM it's not you asking the LLM questions, it's the LLM asking you questions about what to build, how it should work exactly, and whether it should do this or that under what conditions. Because the LLM does the coding, it's you who has to do more thinking. :-)

And when you make the decisions, it is you who is responsible for them. Whereas if you just do the coding, the decisions about the code are left largely to you and nobody much sees them, only how they affect the outcome. Now the LLM is in that role, responsible only for what the code does, not how it does it.

Hehe, speak for yourself. As a 1x coder on a good day, having a nonjudgmental partner who can explain stuff to me is one of the best parts of writing with an LLM :)
I like that aspect of it too. The LLM never seems to get offended even when I tell it it's wrong. I'm just trying to understand why some people say it can feel exhausting. Instead of focusing on narrowly defined coding tasks, the work has changed and you are responsible for a much larger area of work, and expectations are similarly higher. You're supposed to produce 10x code now.
I’d love to see what you’ve built. Can you share?
Maintenance is the hard part, not writing new code or steering and planning.
You can outsource that to another LLM.
If you care about code quality, of course it is exhausting. It's supposed to be. Now there is more code for you to assure the quality of in the same length of time.
If you care about code quality you should be steering your LLM towards generating high quality code rather than writing just 'more code' though. What's exhausting is believing you care about high quality code, then assuming the only way to get high quality code from an LLM is to get it to write lots of low quality code that you have to fix yourself.

LLMs will do pretty much exactly what you tell them, and if you don't tell them something they'll make up something based on what they've been trained to do. If you have rules for what good code looks like, and those are a higher bar than 'just what's in the training data' then you need to build a clear context and write an unambiguous prompt that gets you what you want. That's a lot of work once to build a good agent or skill, but then the output will be much better.

The theory of bounded rationality applies. Tech tools scale systemic capability limits. 3-inch chimp brain limits don't change. The story writes itself.
You used to be a Formula 1 driver. Now you are an instructor for a Formula 1 autopilot. You have to watch it at all times with full attention for it's a fast and reckless driver.
You're being generous to the humans; we're more like Ladas in comparison.
That may not be a bad comparison. An F1 car is a really fast, really specialized car that is also extremely fragile. A Lada may not be too fast, but it's incredibly versatile and robust even after decades of use. And it has more luggage space.
I imagine code reviewing is a very different sort of skill than coding. When you vibe code (assuming you're reading the code that is written for you) you become a code reviewer... I suspect you're learning a new skill.
It’s easier to write code than read it.
I'd argue the read and write procedures happen simultaneously as one goes along when writing code by hand.
The way I've tried to deal with it is by forcing the LLM to write code that is clear, well-factored and easy to review, i.e. continually forcing it to do the opposite of what it wants to do. I've had good outcomes but they're hard-won.

The result is that I can say it is code that I myself approve of. I can't imagine a time when I wouldn't read all of it; when you just let them go, the results are so awful. If you're letting them go and reviewing at the end, like a post-programming review phase, I don't even know if that's a skill that can be mastered while the LLMs are still this bad. Can you really master Where's Waldo? Everything's a mess, but you're just looking for the part of the mess that has the bug?

I'm not reviewing after I ask it to write some entire thing. I'm getting it to accomplish a minimal function, then layering features on top. If I don't understand where something is happening, or I see it's happening in too many places, I have to read the code in order to tell it how to refactor the code. I might have to write stubs in order to show it what I want to happen. The reading happens as the programming is happening.