In my experience, AIs can generate perfectly good code for relatively easy things, the kind you might as well copy&paste from stackoverflow, and they'll very confidently generate subtly wrong code for anything that's non-trivial for an experienced programmer to write. How do people deal with this? I simply don't understand the value proposition. Does Google now have 25% subtly wrong code? Or do they have 25% trivial code? Or do all their engineers babysit the AI and bugfix the subtly wrong code? Or are all their engineers so junior that an AI is such a substantial help?

Like, isn't this announcement a terrible indictment of how inexperienced their engineers are, or how trivial the problems they solve are, or both?

> the kind you might as well copy&paste from stackoverflow

This bothers me. I completely understand the conversational aspect - "what approach might work for this?", "how could we reduce the crud in this function?" - it helped me a lot last year when I tried learning C.

But the vast majority of AI use that I see is...not that. It's just glorified, very expensive search. We are willing to burn far, far more fuel than necessary because we've decided we can't be bothered with traditional search.

A lot of enterprise software is poorly cobbled together from code gathered off Stack Overflow as it is. It's part of the reason why MS Teams makes your laptop run so hot. We've decided that power-inefficient software is the best approach. Now we want to amplify that effect by burning more fuel to get the same answers, but from an LLM.

It's frustrating. It should be snowing where I am now, but it's not. Because we want to frivolously chase false convenience and burn gallons and gallons of fuel to do it. LLM usage is a part of that.

What I can't wrap my head around is that making good, efficient software doesn't (by and large) take significantly longer than making bloated, inefficient enterprise spaghetti. The problem is finding people to do it with who care enough to think rigorously about what they're going to do before they start doing it. There's this bizarre misconception popular among bigtech managers that there's some tunable tradeoff between quality and development speed. But it doesn't actually work that way at all. I can't even count anymore how many times I've had to explain how taking this or that locally optimal shortcut will make it take longer overall to complete the project.

In other words, it's a skill issue. LLMs can only make this worse. Hiring unskilled programmers and giving them a machine for generating garbage isn't the way. Instead, train them, and reject low quality work.

> What I can't wrap my head around is that making good, efficient software doesn't (by and large) take significantly longer than making bloated, inefficient enterprise spaghetti. The problem is finding people to do it with who care enough to think rigorously about what they're going to do before they start doing it.

I don't think finding such programmers is really difficult. What is difficult is finding such people if you also expect them to be docile to incompetent managers and to other people on the project who got their positions not through merit and competence but by playing political games.

"What I can't wrap my head around is that making good, efficient software doesn't (by and large) take significantly longer than making bloated, inefficient enterprise spaghetti."

In my opinion the reason we get enterprise spaghetti is largely due to requirement issues and scope creep. It's nearly impossible to create a streamlined system without knowing what it should look like. And once the system gets to a certain size, it's impossible to get business buy-in to rearchitect or refactor to the degree that is necessary. Plus the full requirements are usually poorly documented and long forgotten by that time.

It's a market for lemons.

Without redoing their work or finding a way to establish deep trust (which is possible, but uncommon at a bigcorp), it's hard to tell who is earnest and who is faking it (or buying their own baloney) when it comes to propositions like "investing in this piece of tech debt will pay off big time".

As a result, if managers tend to believe such plans, bad ideas drive out good and you end up investing in tech-debt proposals that just waste time. Burned managers therefore cope by undervaluing all such proposals and preferring the crappy car that at least you know is crappy over the car that allegedly has a brand-new zero-mile motor but that you have no way of distinguishing from a car with a rolled-back odometer. They take the locally optimal path because it's the best they can do.

It's taken me 15 years of working in the field and thinking about this to figure it out.

The only way out is an organization where everyone is competent and worthy of trust, which, again, is hard to find at most random bigcorps.

This is my current theory anyway. It's sad, but I think it kind of makes sense.

Agreed.

The way I explain this to managers is that software development is unlike most work. If I'm making widgets and I fuck up, that widget goes out the door never to be seen again. But in software, today's outputs are tomorrow's raw materials. You can trade quality for speed in the very short term at the cost of future productivity, so you're really trading speed for speed.

I should add, though, that one can do the rigorous thinking before or after the doing, and ideally one should do both. That was the key insight behind Martin Fowler's "Refactoring: Improving the Design of Existing Code". Think up front if you can, but the best designs are based on the most information, and there's a lot of information that is not available until later in a project. So you'll want to think as information comes in and adjust designs as you go.

That's something an LLM absolutely can't do, because it doesn't have access to that flow of information and it can't think about where the system should be going.

> the best designs are based on the most information, and there's a lot of information that is not available until later in a project

This is an important point. I don't remember where I read it, but someone said something similar about taking a loss on your first few customers as an early stage startup--basically, the idea is you're buying information about how well or poorly your product meets a need.

Where it goes wrong is if you choose not to act on that information.

For sure. Or, worse, choose to run a company in such a way that anybody making choices is insulated from that information.

It's relatively easy to find programmers who can realize enterprise project X; it's hard to find programmers who care about X. Throwing an increased requirement like speed at it makes this worse, because it usually ends up burning out both ends of the equation.
> The problem is finding people to do it with who care enough to think rigorously

> ...

> train them, and reject low quality work.

I agree very strongly with both of these points.

But I've observed a truth about each of them over the last decade-plus of building software.

1) very few people approach the field of software engineering with anything remotely resembling rigor, and

2) there is often little incentive to train juniors and reject subpar output (move fast and break things, etc.)

I don't know where this takes us as an industry, but I feel your comment on a deep level.

> Instead, train them, and reject low quality work.

Ahh, well, in order to save money, training is done via an online class with multiple choice questions, or, if your company is like mine and really committed to making sure you know they take your training seriously, they put portions of a generic book on 'tech Z' into PDFs spread over DRM-ridden web pages.

As for code, that is reviewed, commented on, and rejected by LLMs as well. It used to be turtles. Now it truly is LLMs all the way down.

That said, in a sane world, this is what should be happening for a company that actually wants to get good results over time.

> The problem is finding people to do it with who care enough to think rigorously about what they're going to do before they start doing it.

There is no incentive to do it. I worked that way, focused on quality and testing, and none of my changes blew up in production. My manager opined that this approach was too slow and that it was OK to have minor breakages as long as they were fixed soon. When things break, though, it's a blame game all around. Loads of hypocrisy.

"Slow is smooth and smooth is fast"

> we've decided we can't be bothered with traditional search

Traditional search (at least on the web) is dying. The entire edifice is drowning under a rapidly rising tide of spam and scam sites. No one, including Google, knows what to do about it so we're punting on the whole project and hoping AI will swoop in like deus ex machina and save the day.

Maybe it is naive but I think search would probably work again if they could roll back code to 10 or 15 years ago and just make search engines look for text in webpages.

Google wasn't crushed by spam; they decided to stop doing text search and build search bubbles that are user-specific and location-specific, decided to surface pages that mention search terms in metadata instead of in text users might read, etc. Oh yeah, and about a decade before LLMs were actually usable, they started to sabotage simple substring searches and force a more conversational interface. That's when simple search terms stopped working very well, and you had to instead ask yourself "hmm, how would a very old person or a small child phrase this question for a magic oracle?"

This is how we get stuff like: Did you mean “when did Shakespeare die near my location”? If anyone at google cared more about quality than printing money, that thirsty gambit would at least be at the bottom of the page instead of the top.

Google results are not polluted with spam because Google doesn't know how to deal with it.

Google results are polluted with spam because it is more profitable for Google. This is a conscious decision they made five years ago.

> because it is more profitable for Google

Then why are DuckDuckGo results also (arguably even more so) polluted with spam/scam sites? I doubt DDG is making any profit from those sites since Google essentially owns the display ad business.

DDG is actually Bing: search as a service.

If you own the largest ad network that spam sites use and own the traffic firehose, pointing the hose at the spam sites and ensuring people spend more time clicking multiple results that point to ad-filled sites will make you more money.

Google not only has multiple monopolies, but a cut-and-dried perverse incentive to produce lower-quality results that make the whole session longer instead of short and effective.

I personally think a big problem with search is that major search engines try to be all things to all people, and they suffer as a result.

For example: a beginner developer is possibly better served by some SEO-heavy tutorial blog post; an experienced developer would prefer results weighted towards the official docs, the project's bug tracker and mailing list, etc. But since less technical and non-technical people vastly outnumber highly technical people, Google and Bing end up focusing on the needs of the former, at the cost of making search worse for the latter.

One positive about AI: if an AI is doing the search, it likely wants the more advanced material, not the more beginner-focused one. It can take more advanced material and simplify it for the benefit of less experienced users. It is (I suspect) less likely to make mistakes if you ask it to simplify the more advanced material than if you just give it the more beginner-oriented material. So if AI starts to replace humans as the main clients of search, that may reverse some of the pressure to "dumb it down".

> Traditional search (at least on the web) is dying.

That's not my experience at all. While there are scammy sites, using the search engines as an index instead of an oracle still yields useful results. It only requires learning the keywords, which you can do by reading the relevant materials.

How do you read the relevant materials if you haven't found them yet? It's a chicken-and-egg problem. If your goal is to become an expert in a subject but you're currently a novice, search can't help you if it's only giving you terrible results until you "crack the code."

AI will make the problem of low-quality, fake, fraudulent, and arbitrage content way worse. I highly doubt it will improve searching for quality content at all.

But it can't save the day.

The problem with Google search is that it indexes all the web, and there's (as you say) a rising tide of scam and spam sites.

The problem with AI is that it scoops up all the web as training data, and there's a rising tide of scam and spam sites.

There's no way the search AI will beat out the spamgen AI.

Tailoring/retraining the main search AI will be so much more expensive than retraining the special-purpose spam AIs.

{"deleted":true,"id":42001698,"parent":42000246,"time":1730330751,"type":"comment"}
Without a usable web search index, AI will be in trouble eventually as well. There is no substitute for it.

> The entire edifice is drowning under a rapidly rising tide of spam and scam sites.

You make this claim with such confidence, but what is it based on?

There have always been hordes of spam and scam websites. Can you point to anything that actually indicates that the ratio is now getting worse?

> There have always been hordes of spam and scam websites. Can you point to anything that actually indicates that the ratio is now getting worse?

No, there haven't always been hordes of spam and scam websites. I remember the web of the 90s. When Google first arrived on the scene every site on the results page was a real site, not a spam/scam site.

That was PageRank flexing its capability. There were lots of sites with reams of honeypot text that caught the other search engines.

Google could fix the problem if they wanted to, but it's not in their interests to fix it, since the spam sites generally buy ads from Google and/or display Google ads on their spam websites. Google wants to maximize their income, so..

>> No one, including Google, knows what to do about it

I'm sure they can. But they have no incentive. Try to Google an item, and it will show you a perfect match of sponsored ads and some other not-so-relevant non-sponsored results

AI will generate even more spam and scam sites more trivially.

It took the scam/spam sites a few years to catch up to Google search. Just wait a bit; equilibrium will return.

If only Google were trying to solve search rather than shareholder value.

Kagi has fixed traditional search for me.

Narrator: it did not, in fact, save the day.

I don't get it either. People will say all sorts of strange stuff about how it writes the code for them or whatever, but even using the new Claude 3.5 Sonnet or whatever variant of GPT-4, the moment I ask it anything that isn't the most basic done-to-death boilerplate, it generates stuff that's wrong, and often subtly wrong. If you're not at least pretty knowledgeable about exactly what it's generating, you'll be stuck trying to troubleshoot bad code, and if you are, it's often about as quick to just write it yourself. It's especially bad if you get away from Python and try to make it do anything else. SQL especially, for whatever reason: I've seen all of the major players generate stuff that's either just junk or will cause problems (things that your run-of-the-mill DBA would catch).

Honestly, I think it will become a better Intellisense but not much more. I'm also a little excited: so many people are buying into this and generating so much bad code/bad architecture/etc. that will inevitably need someone to fix it after the hype dies down and the rug is pulled, so I think there will continue to be employment opportunities.

Supermaven is an incredible intellisense. Most code IS trivial and I barely write trivial code anymore. My imports appear instantly, with high accuracy. I have lots of embedded SQL queries and it's able to guess the structure of my database very accurately. As I'm writing a query, the suggested joins are accurate probably 80% of the time. I'm significantly more productive and have to type much less. If this is as good as it ever gets, I'm quite happy. I rarely use AI for non-trivial code, but non-trivial code is what I want to work on…

This is all about the tooling most companies choose when building software: things with so much boilerplate that most code is trivial. We can build tools that have far less triviality and more density, where the distance between the code we write and the business logic is very narrow... but then every line of code we write is hard, because it's meaningful, and that feels bad enough to many developers. So we end up with tools where we might not be more productive, but we might feel productive, even though most of that apparent productivity is trivially generated.

We also have the ceremonial layers of certain forms of corporate architecture, where nothing actually happens, but the steps must exist to match the holy box-box-cylinder architecture. Ceremonial input massaging here, ceremonial data transformation over there, duplicated error checking... if it's easy for the LLM to do, maybe we shouldn't be doing it everywhere in the first place.

I don't think that is the signal most people are hoping for here.

When I hear that most code is trivial, I think of it as a language-design or framework-related issue making things harder than they should be.

Throwing AI or code generators at the problem just to claim that they fixed it is just frustrating.

> When I hear that most code is trivial, I think of it as a language-design or framework-related issue making things harder than they should be.

This was one of my thoughts too. If the pain of using bad frameworks and clunky languages can be mitigated by AI, it seems like the popular but ugly/verbose languages will win out, since there's almost no point to better-designed languages/frameworks. I would rather have a good language/framework/etc. where it is just as easy to write the code directly: similar implementation time to an LLM prompt, but more deterministic.

If people don't feel the pain of AI slop, why move to greener pastures? It almost encourages things not to improve at the code level.

I'm writing software independently, with an extremely barebones framework (just handles routing pretty much) and very lean architecture. Maybe I should re-phrase it, "a lot of characters in the code base are trivial". Imports, function declarations, variable declarations. Is this stuff code/logic? Barely, but it's completely unavoidable. It all takes time and it's now time I rarely have to spend.

Just as an example, I have "service" functions. They're incredibly simple, a higher order function where I can inject the DB handler, user permissions, config, etc. Every time I write one of these I have to import the ServiceDependencies type and declare which dependencies I need to write the service. I now spend close to zero time doing that and all my time focusing on the service logic. I don't see a downside to this.
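
Roughly, the shape is something like this (a sketch, assuming TypeScript; the field names and the example service are invented here for illustration, not my actual code):

    import { Pool } from "pg";

    // The dependencies every service receives; the exact fields are illustrative.
    interface ServiceDependencies {
      db: Pool;
      permissions: Set<string>;
      config: Record<string, string>;
    }

    // A service is a higher-order function: given its dependencies,
    // it returns the function that does the real work.
    function makeArchiveUser(deps: ServiceDependencies) {
      return async (userId: string): Promise<void> => {
        if (!deps.permissions.has("user:archive")) {
          throw new Error("not permitted");
        }
        await deps.db.query(
          "UPDATE users SET archived = true WHERE id = $1",
          [userId],
        );
      };
    }

Importing ServiceDependencies and writing that declaration boilerplate is exactly the part the autocomplete now types for me.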

Most of my business logic is done in raw SQL, which can be complex, but the autocomplete often helps there too. It's not helping me figure out the logic, it's simply cutting down on my typing. I don't know how anyone could be offered "do you want to type significantly fewer characters on your keyboard to get the same thing done?" and say "no thanks". The AI is almost NEVER coding for me, it's just typing for me, and it's awesome.

I don't care how lean your system is, there will at least be repetition in how you declare things. There will be imports, there will be dependencies. You can remove 90% of this repetitive work for almost no cost...

I've tried to use ChatGPT to "code for me", and I agree with you that it's not a good option if you're trying to do anything remotely complex and want to avoid bugs. I never do this. But integrated code suggestions (with Supermaven, NOT Copilot) are incredibly beneficial, and maybe you should just try them instead of coming up with theoretical arguments. I was also a non-believer once.

Well, Google did design Go...

Most programming is trivial. Lots of non-trivial programming tasks can be broken down into pure, trivial sections. Then the non-trivial part becomes knowing how the entire system fits together.

I've been using LLMs for about a month now. It's a nice productivity gain. You do have to read the generated code and understand it. Another useful strategy is pasting a buggy function and asking for revisions.

I think most programmers who claim that LLMs aren't useful are reacting emotionally. They don't want LLMs to be useful because, in their eyes, that would lower the status of programming. This is a silly insecurity: ultimately programmers are useful because they can think formally better than most people. For the foreseeable future, there's going to be massive demand for that, and people who can do it will be high status.

> I think most programmers who claim that LLMs aren't useful are reacting emotionally.

I don't think that's true. Most programmers I speak to have been keen to try it out and reap some benefits.

The almost universal experience has been that it works for trivial problems, starts injecting mistakes for harder problems and goes completely off the rails for anything really difficult.

> I don't think that's true. Most programmers I speak to have been keen to try it out and reap some benefits.

I’ve been seeing the complete opposite. So it’s out there.

> Most programming is trivial

That's a bold statement, and incorrect, in my opinion.

At a junior level, software development can be about churning out trivial code in a previously defined box. I don't think it's fair to call that 'most programming'.

From my perspective, writing out the requirements for an AI to produce the code I want is just as easy as writing it myself. There are some types of boilerplate code that I can see being useful to produce with an LLM, but I don't write them often enough to warrant actually setting up the workflow.

Even with the debugging example, if I just read what I wrote I'll find the bug because I understand the language. For more complex bugs, I'd have to feed the LLM a large fraction of my codebase and at that point we're exceeding the level of understanding these things can have.

I would be pretty happy to see an AI that can do effective code reviews, but until that point I probably won't bother.

It's reasonable to say that LLMs are not completely useless. There is also a very valid case to make that LLMs are not good at generating production ready code. I have found asking LLMs to make me Nix flakes to be a very nice way to make use of Nix without learning the Nix language.

As an example of not being production ready: I recently tried to use ChatGPT-4 to provide me with a script to manage my gmail labels. The APIs for these are all online, I didn't want to read them. ChatGPT-4 gave me a workable PoC that was extremely slow because it was using inefficient APIs. It then lied to me about better APIs existing and I realized that when reading the docs. The "vibes" outcome of this is that it can produce working slop code. For the curious I discuss this in more specific detail at: https://er4hn.info/blog/2024.10.26-gmail-labels/#using-ai-to...


> I think most programmers who claim that LLMs aren't useful are reacting emotionally. They don't want LLMs to be useful because, in their eyes, that would lower the status of programming.

I think revealing the domain each programmer works in and asking in those domains would reveal obvious trends. I imagine if you work in Web development you'll get workable-enough AI-generated code, but something like high-performance computing would get slop worse than copying and pasting the first result from Stack Overflow.

A model is only as good as its learning set, and not all types of code are readily indexable.

> Lots of non-trivial programming tasks can be broken down into pure, trivial sections. Then, the non-trivial part becomes knowing how the entire system fits together.

I think that’s exactly right. I used to have to create the puzzle pieces and then fit them together. Now, a lot of the time something else makes the piece and I’m just doing the fitting together part. Whether there will come a day when we just need to describe the completed puzzle remains to be seen.

Trivial is fine, but as you compound all the triviality, the system starts to have a difficult time putting it together. I don't expect it to nail it, but then you have to unwind everything and figure out the issues, so it isn't all gravy - a fair bit of debugging.

It's always harder to build a mental model of code written by someone else. No matter what, if you trust an LLM on small things, in the long run you'll trust it for bigger things. And the more code the LLM writes, the harder it is to build this mental construct. In the end it'll be « it worked in 90% of cases so we trust it ». And who will debug 300 million lines of code written by a machine that no one has read, on the basis of trust?

They are useful, but so far I haven't seen LLMs being obviously more useful than Stack Overflow. It might generate code closer to what I need than what I find already coded, but it also produces buggier code. Sometimes it will show me a function I wasn't aware of or an approach I wouldn't have considered, but I have to balance that against all the other attempts that didn't produce something useful.

Yes. Productivity tools make programmer time more valuable, not less. This is basic economics. You're now able to generate more value per hour than before.

(Or if you’re being paid to waste time, maybe consider coding in assembly?)

So don’t be afraid. Learn to use the tools. They’re not magic, so stop expecting that. It’s like anything else, good at some things and not others.

A good farmer isn’t likely to complain about getting a new tractor. But it might put a few horses out of work.

I would add that a lot of the time when I'm programming, I'm an expert on the problem domain but not the solution domain — that is, I know exactly what the pseudocode to solve my problem should look like; but I'm not necessarily fluent in the particular language and libraries/APIs I happen to have to use, in the particular codebase I'm working on, to operationalize that pseudocode.

LLMs are great at translating already-rigorously-thought-out pseudocode requirements, into a specific (non-esoteric) programming language, with calls to (popular) libraries/APIs of that language. They might make little mistakes — but so can human developers. If you're good at catching little mistakes, then this can still be faster!

For a concrete example of what I mean:

I hardly ever code in JavaScript; I'm mostly a backend developer. But sometimes I want to quickly fix a problem with our frontend that's preventing end-to-end testing; or I want to add a proof-of-concept frontend half to a new backend feature, to demonstrate to the frontend devs by example the way the frontend should be using the new API endpoint.

Now, I can sit down with a JS syntax + browser-DOM API cheat-sheet, and probably, eventually write correct code that doesn't accidentally e.g. incorrectly reject zero or empty strings because they're "false-y", or incorrectly interpolate the literal string "null" into a template string, or incorrectly try to call Element.setAttribute with a boolean true instead of an empty string (or any of JS's other thousand warts.) And I can do that because I have written some JS, and have been bitten by those things, just enough times now to recognize those JS code smells when I see them when reviewing code.
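
Concretely, the warts I mean, with the idiomatic fixes (a quick sketch; the helper names are invented for illustration):

    // Wart 1: `if (!value)` also rejects 0 and "" because both are falsy;
    // check for null/undefined explicitly instead.
    function isMissing(value: string | number | null | undefined): boolean {
      return value === null || value === undefined; // not `!value`
    }

    // Wart 2: `${x}` interpolates null as the literal string "null";
    // coalesce to a default first.
    function label(x: string | null): string {
      return `value: ${x ?? "(none)"}`;
    }

    // Wart 3: setAttribute takes a string, so a boolean attribute is set
    // with an empty string and removed to unset (never set to `true`).
    function setDisabled(el: Element, disabled: boolean): void {
      if (disabled) {
        el.setAttribute("disabled", "");
      } else {
        el.removeAttribute("disabled");
      }
    }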

But just because I can recognize bad JS code, doesn't mean that I can instantly conjure to mind whole blocks of JS code that do everything right and avoid all those pitfalls. I know "the right way" exists, and I've probably even used it before, and I would know it if I saw it... but it's not "on the tip of my tongue" like it would be for languages I'm more familiar with. I'd probably need to look it up, or check-and-test in a REPL, or look at some other code in the codebase to verify how it's done.

With an LLM, though, I can just tell it the pseudocode (or equivalent code in a language I know better), get an initial attempt at the JS version of it out, immediately see whether it passes the "sniff test"; and if it doesn't, iterate just by pointing out my concerns in plain English — which will either result in code updated to solve the problem, or an explanation of why my concern isn't relevant. (Which, in the latter case, is a learning opportunity — but one to follow up in non-LLM sources.)

The product of this iteration process is basically the same JS code I would have written myself — the same code I wanted to write myself, but didn't remember exactly "how it went." But I didn't have to spend any time dredging my memory for "how it went." The LLM handled that part.

I would liken this to the difference between asking someone who knows anatomy but only ever does sculpture, to draw (rather than sculpt) someone's face; vs sitting the sculptor in front of a professional illustrator (who also knows anatomy), and having the sculptor describe the person's face to the illustrator in anatomical terms, with the sketch being iteratively improved through conversation and observation. The illustrator won't perfectly understand the requirements of the sculptor immediately — but the illustrator is still a lot more fluent in the medium than the sculptor is; and both parties have all the required knowledge of the domain (anatomy) to communicate efficiently about the sculptor's vision. So it still goes faster!

> people who can do it will be high status

They don't have high status even today; imagine a world where they will be seen as just reviewers of AI code...

> They don't have high status even today

Try putting on a dating website that you work at Google vs. that you work in agriculture, and tell us which yields more dates.

Does it matter? I imagine the tanned, shirtless farmer would get more hits than the pasty million-dollar-salary Googler anyway. (No offense to Googlers.)

With so many hits, it's about hitting all the checkmarks instead of minmaxing on one check.

loading story #42002974
loading story #42000378
loading story #42000265
loading story #41999959
loading story #42002328
loading story #42001410
loading story #42001466
loading story #42000375
loading story #42000630
loading story #42000566
loading story #42001746
loading story #42001347
loading story #42002218

> Or do they have 25% trivial code?

Surely yes.

I (not at Google) rarely use the LLM for anything more than two lines at a time, but it writes/autocompletes 25% of my code no problem.

I believe Google have character-level telemetry for measuring things like this, so they can easily count it in a way that can be called "writing 25% of the code".
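
To be clear about the arithmetic I'm imagining, it could be something like accepted-completion characters divided by all committed characters. A toy sketch (pure speculation on my part, not Google's actual telemetry):

    // Count the share of committed characters that came from accepted
    // AI completions. The event shape and names are invented here.
    interface EditEvent {
      chars: number;           // characters committed by this edit
      fromCompletion: boolean; // true if the text came from an accepted suggestion
    }

    function aiWrittenShare(events: EditEvent[]): number {
      const total = events.reduce((sum, e) => sum + e.chars, 0);
      const ai = events.reduce((sum, e) => sum + (e.fromCompletion ? e.chars : 0), 0);
      return total === 0 ? 0 : ai / total; // 0.25 reads as "writing 25% of the code"
    }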

Having plenty of "trivial code" isn't an indictment of the organisation. Every codebase has parts that are straightforward.
