Hacker News new | past | comments | ask | show | jobs | submit
> There would need to be automated pull-request checks verifying not only that tests pass but that code conforms to the spec.

As I understand, this is an unsolved problem.

Step 1: solve the halting problem.
Yep, calling it an "unsolved" problem is a misnomer. We already have mathematical proof that it's impossible.

But that aside, it's such a shame that many drinking the AI Kool-Aid aren't even aware of the theoretical limits of a computer's capabilities.

This sort of theoretical result is not always as clear-cut as you suggest.

Computers are finite machines. There is a theorem that although a machine with finite memory can add, multiplication requires unbounded memory. Somehow we muddle along and use computers for multiplication anyway.

More to your point there is a whole field of people who write useful programs using languages in which every program must be accompanied by a proof that it halts on all inputs.

(See for example https://lean-lang.org/ or David Turner's work on Total Functional Programming from about 20 years ago.)

Other examples are easy to find. The simplex algorithm for linear optimization requires exponential time in general, and the problem it solves is NP-hard, but in practice works well on problems of interest and is widely used. Or consider the dynamic programming algorithms for problems like subset-sum.

Theory is important, but engineering is also important.

loading story #48249548
loading story #48249202
You're probably thinking of Rice's theorem (a sort of generalised halting problem), but this task is actually way easier than that, since we're not trying to study arbitrary algorithms: we're trying to study the subset of human-comprehensible algorithms. Most of the things we want computers to do are things that, given enough time, someone can write a program to solve: and generally, those programs are not tricksy or meta, and don't involve unsolved mathematics problems found by studying busy-beaver candidates. If it's possible for a human to understand how a program works (which is a necessary component of writing such a program), it's possible to write a mathematical proof about the program's behaviour, and that means it's in principle possible to automate the construction of that mathematical proof, which is equivalent to the "determine whether code conforms to the spec" task.

"Somewhat easier than an impossible task" is not a particularly strong claim about when (or whether) this problem will be solved, though.

We can't create an algorithm determining whether a computer program halts or not, but we can write one that checks whether it conforms to natural language specifications much more easily? That makes no sense. There's no exception to the halting problem regarding "human comprehensible" computer programs.
loading story #48250029
loading story #48250188
I will give you a class of programs humans wrote and they want improved: LLMs.

Those were written by humans, and don't involve unsolved mathematics.

Is your claim tht you just need to solve comprehensibility of LLMs?

Figuring out epistemology and cognition to have a chance to reason about the outputs of a LLM seems to me way harder that traditional attempts to reason directly about algorithms.

this is actually precisely what humans' roles will be.

"is this implementation/code actually aligned with what i want to do?"

humanic responsibility's focus will move entirely from implementing code to deciding whether it should be implemented or not.

u probably mean unsolved as in "not yet able to be automated", and that's true.

if pull-request checks verifying that tests are conforming to the spec are automated, then we'd have AGI.

loading story #48248946
loading story #48248442
loading story #48247920