Hacker News
What I find fascinating is that there is so little substance in this article about the quality of the produced code and the medium. Is the code documented and tested? Is it understandable and extendable? Is it secure? What language, framework, database was used? The author mentions judgement and taste - well, is the code tasteful? Will the model rearchitect the entire thing if I ask it to add new functionality, spending another 9.5h in tokens? I assume that the research part is domain knowledge (how different types of travel translate to time) and making it presentable; how did the author verify this?

These questions are not even about AI: if I were to give money to a human agency and were given something they tell me works, I would ask the same questions. If I did not know how to evaluate it, I would hire people who do. With LLMs the verification part is what bothers me the most.

These posts are never written by software engineers, it’s always some tech exec, retired engineer, or VC. This author is apparently a professor at the Wharton School of Management? None of these people have to ship or maintain real products, they’re just making side projects.

The only decent software engineering perspective I’ve seen has been from Mitchell Hashimoto.

Well that’s kind of the point.

They can just summon bespoke software out of the ether that only handles the use cases of themselves and a few of their collaborators.

Making “side projects” was not possible for non-developers before powerful LLMs. Now it is.

I don’t think that’s true, I think these authors are making a much stronger claim that AI is proficient or even an expert at software engineering. This author describes how complex and sophisticated their software is, and the only value he’ll concede to “coders” is that there might be a few bugs they’d need to fix.

Imagine not being an architect and using Claude to put together a building plan, then concluding it’s basically done but we might need a real architect to double check the measurements. It may even be true but I’d be skeptical if it’s always non-architects saying this.

Why do they even need coders to fix these bugs? It would be an order of magnitude (at least) cheaper to ask Claude to find and fix them, and it will likely be successful.

Building in the physical world has physical and time constraints that cannot be overcome, which is one of the reasons architecture (and engineering) are so important in this domain. In software development these constraints were only inherent when people were writing the majority of the software. I feel like I’m seeing what I thought were fundamental constraints being eroded by the increasing speed and correctness of these tools and it’s making me reconsider the importance of some of the values that are held by software engineering.

It’s obviously dependent on the domain and solution, but if your software can be extremely rapidly rearranged, bugs found and fixed with little effort, and features added with only a minimum prompt, I think the entire definition of technical debt has changed. I’ve been sceptical of these tools and still approach their output with caution. I also worry that, as a software developer, if more can be accomplished in less time there will be less room on this planet for software developers.

It is, and it's cool that it is, but the calibration is important. Statements like this:

> With Fable the spell has gotten powerful enough that I am no longer sure I am the wizard. I am closer to a patron. I describe what I want, I pay for it, and I judge the result. The conjuring happens somewhere I cannot watch, in hundreds of small choices I never get a vote on. The work has shifted from process to outcome. I no longer steer; I commission.

have a very different meaning coming from a non-technical researcher than they would from someone who builds software for a living.

Making side projects isn't a trillion dollar industry tho, adding to the fact that we are facing another global supply chain crisis due to the Iran War; the US is about to commit the biggest self-own ever in the history of empire.
I’m starting to realize that LLMs are really good at building low-stakes projects. Your questions mostly presume that the stakes are higher. The software will last a long time; the requirements will evolve; we can’t tolerate mistakes; etc.

The trick to getting good at using LLMs for software is to learn how to make _all_ projects low-stakes.

You don't need an LLM for that. You make _all_ projects low-stakes by working on a greenfield project using (insert buzzword soup of the day) and leaving for a new greenfield opportunity (that requires experience with buzzword soup of the day) before the project ships.
No, what you’re describing still requires you to do some actual work, and also, while you work there, there is still some level of accountability. A much, much better grift is coaching.

Like, an AI coaching session for executives at the yearly executive retreat. You show up, spend a few hours going through some nonsense slides ChatGPT put together for you, and charge an eye-watering fee for it. HR or whoever organizes it will gladly pay because it makes them look all cutting edge in front of the CEO, and by the next day everyone will have forgotten about it. No accountability at all!

If there were a viable way to make all projects low-stakes, we'd have done it. Consider this: microservices.
This is really insightful, but I think it also extends to making the project either low stakes or low complexity. I have this lurking feeling that the preferable architecture for software will change as a result of LLMs because they're good at working on low complexity modular components more than they are on high complexity million-line code bases.
You'll just shift complexity to the orchestration of the modular components.

Monoliths vs micro-services.

> The trick to getting good at using LLMs for software is to learn how to make _all_ projects low-stakes.

This doesn't really work in the real world. Many things actually matter, and engineering is fundamentally about handling them.

> What I find fascinating is that there is so little substance in this article about the quality of the produced code and the medium.

Intrigued, I clicked one of his examples: "a snake game where the snake is self-aware and crazy things happen". I played for 1-2 minutes, and it's the classic 1980s snake game. Am I missing something? What is "self-aware" about it? Some funny messages at the bottom of the screen? And what are the "crazy things"?

It sounds like you either didn't play enough or you are missing the new mechanics that get added over time. There's definitely more to it than just regular snake.
I had the exact same thought. To me, it feels like they just took the fairly common “sentient video game character” trope and bolted it onto a very conventional snake game.

I will say, the way the act of eating creates a "bulge distortion" that flows down the length of the snake is a nice touch though.

You didn't play long enough. There are layers and layers and layers of features in that game if you play for 10 minutes or more.
Can you spoil it for us?

    the quality of produced code and the medium
A thought I have been tossing around in my head as the models get better is that it really may not matter what the code looks like.

If the observed behavior of the software is good, then the software is good. If a bug, of whatever kind, can be fixed by a model on a vibe-coded codebase, then that's a fixable bug. If there are no exploitable vulnerabilities, then the code is secure. If the performance is adequate, then the code is performant.

It simply does not matter what the code looks like if, from the outside, it does what it's supposed to, and, from the inside, a model can fix the issue if one is found.

More than ever, software engineering is now really a job about making sure the code is doing what it's supposed to.

And even if it DOES matter what the code looks like, you can have a model fix that too.

The thing is that a lot of code relies on multiple layers of abstraction, each with its own correctness and failure states. And then you overlay the domain's correctness and failure cases on top of that.

But all of those correctness guarantees are imaginary. The hardware only enforces a few (and it may be buggy). The OS adds some more (and it’s buggy). The compiler/interpreter may have bugs (but that’s rarely a nuisance) and the libraries are often brittle. There are cracks everywhere in the tower of abstractions.

The code has never mattered. What has always mattered is knowing the software's model of correctness ("Programming as Theory Building" by Naur), so that you can discern where a program is wrong.

The thing is, a crash or some other immediate error is actually nice to have. You get to react immediately and can have a core dump or a stack trace that points you to the error. What is truly a terror is silent corruption (wrong order of operations, wrong values for a comparison that has expanded the idea of correctness, security issues that have been backdoored for years,…).

As Hoare said:

  There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies and the other way is to make it so complicated that there are no obvious deficiencies.
  The first method is far more difficult.
LLMs are very much the second kind. You write a lot of complicated code, and then you can no longer reason about its correctness.
Welcome to every LLM discussion in the past 2 years or so. When we ask for anything of substance, we're faced with a barrage of "but humans aren't good at this either!" Very little quantifiable evidence and lots of pure rhetoric.
I’ve seen this pattern again and again, and I don’t bother replying. There’s also the “strong statement, and when you contradict it, they point out some particular circumstances that no one cares about”.
Being the first to release an article gives you great SEO or whatever. Doing the things you've mentioned takes time.
Less fascinating when you consider that this is a non-coder's perspective.
Fair enough, but entrepreneurship should, I guess, ask whether a given Next Big Thing has substance behind it or is just snake oil.
Ah, but billions of dollars depend on those questions not being asked in a genuine manner. Don't you want a slice of that or are you an... AI skeptic? *thunder clashes*
Yeah, this made it basically clickbait for me, in terms of time I wasted with the wrong expectation.

The lack of downvotes on posts on HN has always felt like more of a bug than a feature to me.

So, the perspective of the one that gains the most, that will value this the most, and that will pay the most? ;)
You probably don't care about the ingredients or engineering of asphalt, only if the road does its job well or is filled with potholes. Outside of the software industry, nobody gives a shit about code or databases.
I agree. But if I'm paying for the road (even as a taxpayer) I get angry when after a year it's full of potholes and there are unnecessary signs warning about penguin crossings, making it cost 2 times more than it should have (and don't get me started on why this road is really a highway leading to my house). I'd want certain qualities. And this article basically says: you will get a road, built quickly.

But yes, you are right - I don't build roads, I don't know what it costs to build one or how to judge the quality of a correctly built one, and I will never care or learn.

The ingredients and composition of the tarmac make the difference in whether the road is full of potholes after a week of use.
Sure, but if there's a trillion dollar company saying that it's going to replace all our road workers or engineers - I'd want to listen to the opinion of an expert. Some reporter from CNN driving over it like "yeah seems good to me, good this" has approximately zero persuasive power to me.
Does it matter to the people requesting the software if it acts in the way they expect?
We've lived in a software bubble for so long that most software engineers have completely forgotten that the purpose of (most) software is to solve a problem. If the software solves the problem well and reliably, the quality of the code doesn't matter.

In fact, that's the entire reason we care about "quality code", because we assume that quality code is code that does what you expect well and consistently.

I say this as someone who hand-writes code pretty much every night for fun, just to experiment with computation. Which, oddly, is more fun than ever because I don't feel like there's any need to connect this type of programming with "real world software", and I can really enjoy code for its own sake. Meanwhile, my job is mostly just running agent loops (which I quite like as well).

I haven't forgotten that, I affirmatively think it's false. High quality code is necessary to solve problems reliably. Perhaps some people call things code quality when they don't matter (I really don't care what most variables are named), but there have always been teams who try to increase velocity by disregarding code quality, and from what I've seen AI does not stop them from shipping outages constantly.
True, but you should say that about everything. Does it matter to you how the car drives, as long as it takes you to your destination? Well, yes, it matters: how it will deal with a crash, whether it's possible to replace a part, and whether anybody can just open it if you leave it outside. I will be amazed if somebody shows me their home-printed car, but if they try to sell it to me like a new one...
I'm becoming more convinced these are questions of the Before Times. Yes, yes—heresy, I know.

Yet, I can't deny the reality that I observe working with LLMs every day. If this truly is a step-function (as some are suggesting), then I have absolutely zero concern for the quality of the code.

Kind of a circular argument, isn't it? "Some people are saying it's very good at coding. If that's true, I don't care if the code is good."