Story Detail of id 48212866 | Liveview Hacker News

vatsachak 20 hours ago | on: An OpenAI model has disproved a central conjecture in discrete geometry

As I have stated before, AI will win a Fields Medal before it can manage a McDonald's.

A difficult part was constructing a chess board on which to play math (Lean). Now it's just pattern recognition and computation.

LLMs are just the beginning; we'll see more specialized math AI resembling Stockfish soon.

trostaft 20 hours ago | parent | next

> A difficult part was constructing a chess board on which to play math (Lean). Now it's just pattern recognition and computation.

However, this was not verified in Lean. This was purely plain language in and out. I think, in many ways, this is quite an exciting demonstration of exactly the opposite of the point you're making. Verification comes in when you want to offload checking proofs to computers as well. As it stands, this proof was hand-verified by a group of mathematicians in the field.

vatsachak 19 hours ago | root | parent | next

Yeah, but I wouldn't be surprised if they train the model on verification assisted by Lean.

ComplexSystems 19 hours ago | root | parent | next

That may be true for now, but it seems clear enough that letting the model use Lean in its internal reasoning process would be a great idea.

NooneAtAll3 13 hours ago | root | parent | next

how would they calculate "probability of solving" without automated verification?

ken47 12 hours ago | root | parent

> However, this was not verified in Lean.

This is the caliber of thinking in unimpaired AI bullishness.

Terr_ 20 hours ago | parent | next

> manage a McDonald's

Dystopia vibes from the fictional "Manna" management system [0] used at a hamburger franchise, which involved a lot of "reverse centaur" automation.

> At any given moment Manna had a list of things that it needed to do. There were orders coming in from the cash registers, so Manna directed employees to prepare those meals. There were also toilets to be scrubbed on a regular basis, floors to mop, tables to wipe, sidewalks to sweep, buns to defrost, inventory to rotate, windows to wash and so on. Manna kept track of the hundreds of tasks that needed to get done, and assigned each task to an employee one at a time. [...]

> At the end of the shift Manna always said the same thing. “You are done for today. Thank you for your help.” Then you took off your headset and put it back on the rack to recharge. The first few minutes off the headset were always disorienting — there had been this voice in your head telling you exactly what to do in minute detail for six or eight hours. You had to turn your brain back on to get out of the restaurant.

[0] https://en.wikipedia.org/wiki/Manna_(novel)

tomjakubowski 14 hours ago | root | parent | next

Amazing bit of trivia that the founder of HowStuffWorks.com was named Marshall Brain.

kmeisthax 19 hours ago | root | parent

Casual reminder that the author's proposed solution to the labor-automation dystopia is to invent a second identity-verification dystopia. Also casual reminder that the author wanted the death penalty for anyone over the age of 65.

Lerc 20 hours ago | parent | next

I disagree. It will be able to perform work deserving of a Fields Medal before it is capable of running a McDonald's. I think it will be running a McDonald's well before either of those things happen, and win a Fields Medal long after both have happened.

edbaskerville 20 hours ago | root | parent | next

I just visited a McDonald's for the first time in a while. The self-order kiosk UI is quite bad. I think this is evidence in favor of the idea that an incompetent AI will soon be incompetently running a McDonald's.

pocksuppet 6 hours ago | root | parent | next

Recently I tried to order at a Subway (which has decent quality food outside of the USA). They have kiosks. The kiosk only responded to touch about 60% of the time and took two seconds to respond. The employee who could've easily taken my order was just standing there bored. The future is here and it sucks.

Silamoth 20 hours ago | root | parent | next

Out of curiosity, what issue did you have with the McDonald’s self-order kiosk? I actually think McDonald’s has the best kiosk I’ve ever encountered. The little animation that plays when you add an item to your cart is a little annoying (but I think they’ve sped that up). But otherwise, it’s everything I’d want. It shows you all the items, tells you every ingredient, and lets you add or remove ingredients. I have a better experience ordering through the kiosk than I do talking to a cashier.

jldugger 16 hours ago | root | parent | next

>The self-order kiosk UI is quite bad.

Most repeat customers use the app, which sports the digital equivalent of a loyalty program, and various coupons. And lets you save your 'usual' order with customizations etc. Plus the annoying push notifications for FreeFrydays or whatever. And upsells, new product launches, etc.

My recollection is that the kiosk is just a weak facsimile of the app. It wasn't terrible, but everyone's standards vary.

c7b 19 hours ago | root | parent | next

One could hardly ask for a task better suited for LLMs than producing math in Lean. Running a restaurant is so much fuzzier, from the definition of what it even means to the relation of inputs to outputs and evaluating success.

moron4hire 13 hours ago | root | parent

I think Lerc is saying that LLMs will be pressed into service managing McDonald's restaurants long before they are actually capable of managing said restaurants successfully.

vatsachak 19 hours ago | root | parent

Not necessarily. Obviously playing Kasparov on the board requires more planning ability than managing a McDonald's, but look at where chess bots are now.

There's much more to being human than our "cognitive abilities".

pamcake 13 hours ago | root | parent | next

> Obviously playing Kasparov on the board requires more planning ability than managing a McDonald's

Not obvious and in fact I think the opposite is way more likely. Chess is well-defined and self-contained in a way that managing a restaurant with fleshy customers never will be.

baq 19 hours ago | root | parent

Conjecture: the first AI to successfully manage a McDonald’s will be a Gemini.

jeremyjh 4 hours ago | parent | next

Stockfish did not teach itself to play chess. You are probably thinking of Leela Chess Zero - an open re-implementation of AlphaZero - both were given nothing but the rules of chess and a board and played millions of games against themselves until they were the strongest engine available at the time.

Stockfish's neural net evaluation model was trained on millions of its positions with its own original algorithmic evaluation function (entirely developed by humans) and search tree. The result was a much smaller model than Leela's that requires little computation (not even a GPU), paired with its already extremely efficient search/pruning algorithms that made it stronger than Leela in competitive play. Leela's evaluation function is much stronger (at one ply it has an Elo of around 2300; Stockfish is probably closer to 1800), but it requires vastly more resources and those are always bounded in a match.

Humans haven't learned as much new information about chess from Stockfish as we have from Leela.

evenhash 20 hours ago | parent | next

The proof is not written in Lean, though. It’s written in English and requires validation by human experts to confirm that it’s not gibberish.

vatsachak 19 hours ago | root | parent

Yeah, but I wouldn't be surprised if they train the model on verification assisted by Lean.

energy123 9 hours ago | parent | next

The issue with this prediction is the gulf between problem-solving using known tools, versus creating new concepts for problems where existing tools aren't enough.

All AI proofs so far, including this one, are using existing tools in new ways, rather than inventing new tools. This is not surprising if you know how these models are trained. These existing tools are in distribution. New tools are not.

Problems worthy of a Fields Medal likely require new tools to be invented. Thus it is not clear whether progress within the confines of the current paradigm is enough.

We could get this weird spiky situation where the AI is insanely superhuman at all problem solving, but completely incapable of coming up with a single new tool. It discovers everything there is to discover, subject to existing axioms and concepts.

Timothy Gowers gives some commentary on this in the attached PDF.

auggierose 18 hours ago | parent | next

> A difficult part was constructing a chess board on which to play math

We have had that chess board for quite a while now, over 40 years. And no, there is nothing special about Lean here; it is just herd mentality. Also, we don't know how much training with Lean helped this particular model.

KalMann 19 hours ago | parent | next

I think your analogy is good, but I don't believe modern LLMs use Lean or any Lean-like structure in their proofs. At least recent open source ones like DeepSeek can do advanced math without it (maybe the most cutting-edge ones are doing it; I can't say).

vatsachak 14 hours ago | root | parent

They are most likely using them in training. I doubt their IMO team are show ponies.

forinti 20 hours ago | parent | next

AI is already too old for that.

sigmoid10 20 hours ago | parent | next

Managing a McDonald's is a question of integration and modalities at this point. I don't think anyone still doubts that these models have the reasoning capability or world knowledge needed for the job. So it's less of a fundamental technical problem and more of a process engineering issue.

dap 15 hours ago | root | parent | next

Have you not seen:

https://www.anthropic.com/research/project-vend-1

https://www.wsj.com/tech/ai/anthropic-claude-ai-vending-mach...

(Two different examples of a similar idea)

andy12_ 19 hours ago | root | parent | next

I disagree. Even frontier models still achieve way worse results than the human baseline in VendingBench. As long as models can't optimally manage something as simple as a vending machine, they have no hope of managing a McDonald's.

throw-the-towel 20 hours ago | root | parent

The capability they lack is being able to be sued.

volkercraig 19 hours ago | parent | next

> we'll see more specialized math AI resembling Stockfish soon

Heuristically weighted directed graphs? Wow amazing I'm sure nobody has done that before.

vatsachak 19 hours ago | root | parent

My claim is that LLMs waste a lot of time training on all available data.

Math is a sequence of formal rules applied to construct a proof tree. Therefore an AI trained on these rules could be far more efficient, and search far deeper into proof space.
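
For readers unfamiliar with what "formal rules applied to construct a proof tree" looks like in practice, here is a minimal Lean 4 sketch. It is not from the thread or from any model's output; the theorem name `swap_conj` is made up for illustration. The first version writes the tree out directly as a term, and the second builds the same tree step by step with tactics, which is roughly the kind of move-by-move search space the comment has in mind.

```lean
-- Term-style proof: the root rule is And.intro, and its two children
-- are the projections h.right and h.left of the hypothesis.
theorem swap_conj (p q : Prop) (h : p ∧ q) : q ∧ p :=
  And.intro h.right h.left

-- Tactic-style proof of the same statement: each tactic applies one rule
-- and leaves smaller goals, so the script traces out the same proof tree.
example (p q : Prop) : p ∧ q → q ∧ p := by
  intro h        -- assume p ∧ q
  constructor    -- split the goal q ∧ p into two subgoals
  · exact h.2    -- first subgoal: q
  · exact h.1    -- second subgoal: p
```

Lean's kernel checks every step, so a search procedure can try rule applications freely and trust any proof that compiles.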

brikym 13 hours ago | parent | next

Hey ChatGPT, if a person spills hot McCoffee on themselves who is at fault?

brookst 13 hours ago | root | parent

Well, brikym, exactly how hot is this hot coffee? If it’s within normal expectations for coffee it is likely that person’s fault. If it is 210 degrees F, it is likely McDonald’s fault.

whimsicalism 20 hours ago | parent | next

The only thing keeping the McDonald's from happening will be political; likewise with the Fields Medal.

soupspaces 20 hours ago | parent | next

Lee Sedol, Move 37: https://www.reddit.com/r/singularity/comments/1l0z5yk/the_mo...

Edit: I wasn't necessarily disagreeing. But on second thought, the chessboard in this math analogy is being built, not just played on. This Hardy quote comes to mind: https://www.goodreads.com/quotes/902543-it-proof-by-contradi...

vatsachak 19 hours ago | root | parent

My claim is that we haven't even witnessed the move 37 of math yet. I am claiming that math AI is going to get even better.

segmondy 20 hours ago | parent | next

Our local AI models are already capable of running a McDonald's.

hoppyhoppy2 6 hours ago | root | parent

Why aren't they doing so?

fapjacks 13 hours ago | parent | next

I dunno. Is AI less than forty years old?

ori_b 19 hours ago | parent | next

We're automating art and science so that we can flip burgers. This future sucks.

vatsachak 19 hours ago | root | parent | next

Math is a very specialized subset of art and science more amenable to automation.

dyauspitr 15 hours ago | root | parent

No, we’re not going to be flipping burgers either; they will have physical robots for that. 20 years down the line, I wonder what work all of us will be doing.

dyauspitr 19 hours ago | parent | next

Nonsense. Have you been watching the Figure live stream? Or the Unitree video from yesterday with real-time novel action generation? We’re less than a year away. If you can cook a burger, assemble a sandwich and clean up surfaces, you’re all of the way there.

vatsachak 19 hours ago | root | parent

Fair. Let's see in a year. I'm willing to bet that nothing happens.
