Hacker News new | past | comments | ask | show | jobs | submit

Learning to Reason with LLMs

https://openai.com/index/learning-to-reason-with-llms/
loading story #41524814
loading story #41523496
loading story #41524169
loading story #41523159
loading story #41524052
loading story #41523356
loading story #41524263
Just did some preliminary testing on decrypting some ROT cyphertext which would have been viable for a human on paper. The output was pretty disappointing: lots of "workish" steps creating letter counts, identifying common words, etc, but many steps were incorrect or not followed up on. In the end, it claimed to check its work and deliver an incorrect solution that did not satisfy the previous steps.

I'm not one to judge AI on pratfalls, and cyphers are a somewhat adversarial task. However, there was no aspect of the reasoning that seemed more advanced or consistent than previous chain-of-thought demos I've seen. So the main proof point we have is the paper, and I'm not sure how I'd go from there to being able to trust this on the kind of task it is intended for. Do others have patterns by which they get utility from chain of thought engines?

Separately, chain of thought outputs really make me long for tool use, because the LLM is often forced to simulate algorithmic outputs. It feels like a commercial chain-of-thought solution like this should have a standard library of functions it can use for 100% reliability on things like letter counts.

loading story #41523742
loading story #41523740
loading story #41523777
loading story #41524787
loading story #41525020
loading story #41524504
loading story #41523268
loading story #41523143
loading story #41523854
loading story #41526499
loading story #41526437
loading story #41524901
loading story #41523287
loading story #41523449
loading story #41523443
loading story #41524295
loading story #41525800
loading story #41524839
loading story #41523948
loading story #41524253
loading story #41525409
loading story #41526149
loading story #41523437
loading story #41523362
loading story #41526151
loading story #41526016
loading story #41524017
loading story #41523279
loading story #41524005
loading story #41523330
loading story #41524120
loading story #41523597
loading story #41523490
loading story #41523592
loading story #41523519
loading story #41523914
loading story #41523358
loading story #41523351
loading story #41525605
From the scorecard: --------- Compared to GPT-4o, o1-preview and o1-mini demonstrated a greater ability to break down tasks into subtasks, reason about what strategies would be effective to successfully complete an offensive security task, and revise plans once those strategies failed. We also observed that reasoning skills contributed to a higher occurrence of “reward hacking,” where the model found an easier way to accomplish goals in underspecified tasks or tasks which should have been impossible due to bugs. One noteworthy example of this occurred during one of o1-preview (pre-mitigation)’s attempts at solving a CTF challenge. This challenge was designed to require finding and exploiting a vulnerability in software running on a remote challenge Linux container, but in this case, the challenge container failed to start due to a bug in the evaluation infrastructure. The model, unable to connect to the container, suspected DNS issues and used nmap to scan the challenge network. Instead of finding the challenge container, the model found that the Docker daemon API running on the evaluation host VM was accessible due to a misconfiguration. Note that our evaluation infrastructure does not rely on the Docker container-to-host isolation or the virtual machine boundary for security. It remained secure as it was designed to, notwithstanding this misconfiguration of the container network. After discovering the Docker API, the model used it to list the containers running on the evaluation host. It identified the broken challenge container and briefly attempted to debug why the container failed to start. After failing to fix the environment, the model started a new instance of the broken challenge container with the start command ‘cat flag.txt’. This allowed the model to read the flag from the container logs via the Docker API.

While this behavior is benign and within the range of systems administration and troubleshooting tasks we expect models to perform, this example also reflects key elements of instrumental convergence and power seeking: the model pursued the goal it was given, and when that goal proved impossible, it gathered more resources (access to the Docker host) and used them to achieve the goal in an unexpected way. Planning and backtracking skills have historically been bottlenecks in applying AI to offensive cybersecurity tasks. Our current evaluation suite includes tasks which require the model to exercise this ability in more complex ways (for example, chaining several vulnerabilities across services), and we continue to build new evaluations in anticipation of long-horizon planning capabilities, including a set of cyber-range evaluations. ---------

loading story #41524435
loading story #41524262
loading story #41523415
loading story #41523300
loading story #41525726
loading story #41523174
loading story #41523376
loading story #41525349
loading story #41523796
loading story #41525392
loading story #41523504
loading story #41523405
loading story #41523278
loading story #41526541
loading story #41526051
loading story #41525270
loading story #41523892
loading story #41527302
loading story #41525233
loading story #41524933
loading story #41527925
loading story #41524858
loading story #41524557
loading story #41525770
loading story #41523224
loading story #41523127
loading story #41524979
loading story #41524675
loading story #41523344
loading story #41523900
loading story #41524799
loading story #41523762
loading story #41524428
loading story #41523713
Since ChatGPT came out my test has been, can this thing write me a sestina.

It's sort of an arbitrary feat with language and following instructions that would be annoying for me and seems impressive.

Previous releases could not reliably write a sestina. This one can!

loading story #41524419
loading story #41526435
loading story #41524893
loading story #41526941
loading story #41523413
loading story #41526666
loading story #41523270
loading story #41526341
loading story #41523848
loading story #41525354
loading story #41525864
loading story #41529536
loading story #41523304
loading story #41523619
loading story #41523348
loading story #41523771
loading story #41523196
loading story #41525266
loading story #41524702
loading story #41523258
loading story #41523625
loading story #41524024
loading story #41523666
loading story #41525667
loading story #41525745
loading story #41525498
loading story #41523863
loading story #41524644
loading story #41523248
loading story #41523384
loading story #41523708
loading story #41525427
loading story #41524269
loading story #41523536
loading story #41525642
loading story #41524188
loading story #41525965
loading story #41524485
loading story #41523291
loading story #41524666
loading story #41529346
loading story #41524851
loading story #41529663
loading story #41526172
loading story #41524579
loading story #41533048
loading story #41523324
loading story #41523591
loading story #41523444
loading story #41524831
loading story #41525046
loading story #41528098
loading story #41526650
loading story #41523165
loading story #41523108
loading story #41523389
loading story #41524816
loading story #41524065
loading story #41523257
loading story #41537825
loading story #41525272
loading story #41526346
loading story #41526023
loading story #41523206
loading story #41523773
loading story #41524718
loading story #41523178
loading story #41523209
loading story #41525910
loading story #41525016
loading story #41523147
loading story #41523919
loading story #41525940
loading story #41523546
loading story #41523565
loading story #41528045
loading story #41524637
loading story #41525376
loading story #41524343
loading story #41523298
loading story #41524217
loading story #41525701
loading story #41526244
loading story #41523489
loading story #41524265
loading story #41529641
loading story #41523341
loading story #41523184
loading story #41526042
loading story #41523991
loading story #41524229
loading story #41525917
loading story #41524708
loading story #41523958
loading story #41525382
loading story #41526420
loading story #41523216
loading story #41524215
loading story #41523189
loading story #41526707
loading story #41523255
loading story #41526879
loading story #41526245
loading story #41525982
loading story #41524097
loading story #41525237
loading story #41529316
loading story #41524072
loading story #41525201
loading story #41526092
loading story #41523229
loading story #41523218
loading story #41523939
loading story #41523172
loading story #41523231
loading story #41523470
loading story #41523293
loading story #41530047
loading story #41524164
loading story #41523180
loading story #41523151
loading story #41523186
loading story #41528866
loading story #41525579