Story Detail of id 48216231 | Liveview Hacker News

dap16 hours ago | on: An OpenAI model has disproved a central conjecture in discrete geometry

Have you not seen:

https://www.anthropic.com/research/project-vend-1 https://www.wsj.com/tech/ai/anthropic-claude-ai-vending-mach...

(Two different examples of a similar idea)

Both links talk about the same thing? The first one is just more general. And yes, I would expect no less from a poorly constrained single agent that was instruction-trained to be helpful and friendly. But if you look at how this has evolved as a benchmark [1], the latest models leave no doubt that they can actually deal with this limited, simulated scenario given the correct setup.

[1] https://andonlabs.com/evals/vending-bench-2
