Story Detail of id 48213436 | Liveview Hacker News

trostaft20 hours ago | on: An OpenAI model has disproved a central conjecture in discrete geometry

> A difficult part was constructing a chess board on which to play math (Lean). Now it's just pattern recognition and computation.

However, this was not verified in Lean. This was purely plain language in and out. I think, in many ways, this is a quite exciting demonstration of exactly the opposite of the point you're making. Verification comes in when you want to offload checking proofs to computers as well. As it stands, this proof was hand-verified by a group of mathematicians in the field.

vatsachak20 hours ago | parent | next

Yeah, but I wouldn't be surprised if they train the model on verification assisted by Lean.

trostaft19 hours ago | root | parent | next

Arguing similarly to how stockfish, the chess engine, trains I would not be surprised if this is more common in the future. I don't know if they use any proof verification tools during their reinforcement learning procedure right now, as far as I know they've been focusing more on COT based strategies (w/o Lean). But I'm hardly an LLM expert, I don't know.

vatsachak15 hours ago | root | parent

They most definitely threw in rl with formal verification somewhere between GPT 4 and now. The models are better at not hallucinating. I don't think their IMO team are only show ponies...

cma7 hours ago | root | parent

Same could be said for human mathematicians that learn from tools like Lean.

ComplexSystems19 hours ago | parent | next

That may be true for now, but it seems clear enough that letting the model use Lean in its internal reasoning process would be a great idea

trostaft19 hours ago | root | parent

That I'd agree with! I really need to get around to learning Lean myself. It might be interesting to try and formalize some missing theoretical pieces from my field (or likely start smaller).

NooneAtAll313 hours ago | parent | next

how would they calculate "probability of solving" without automated verification?

ken4713 hours ago | parent

> However, this was not verified in Lean.

This is the caliber of thinking in unimpaired AI bullishness.

#visit	13,280,164
#session	74,665
#live-session	0