Learning to Reason with LLMs
https://openai.com/index/learning-to-reason-with-llms/
I'm not one to judge AI on pratfalls, and ciphers are a somewhat adversarial task. However, no aspect of the reasoning seemed more advanced or more consistent than previous chain-of-thought demos I've seen. So the main proof point we have is the paper, and I'm not sure how I'd get from there to trusting this on the kind of task it is intended for. Do others have patterns by which they get utility from chain-of-thought engines?
Separately, chain of thought outputs really make me long for tool use, because the LLM is often forced to simulate algorithmic outputs. It feels like a commercial chain-of-thought solution like this should have a standard library of functions it can use for 100% reliability on things like letter counts.
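To make the "standard library" idea concrete, here is a minimal Python sketch of what such deterministic helpers might look like. The names `count_letter`, `letter_counts`, `TOOLS`, and `call_tool` are hypothetical, invented for illustration; this is not an actual OpenAI tool interface, just an assumption about how a model could hand off letter counting to code instead of simulating it token by token.

```python
from collections import Counter


def count_letter(text: str, letter: str, case_sensitive: bool = False) -> int:
    """Count occurrences of a single character in text (hypothetical helper)."""
    if len(letter) != 1:
        raise ValueError("letter must be a single character")
    if not case_sensitive:
        text, letter = text.lower(), letter.lower()
    return text.count(letter)


def letter_counts(text: str) -> Counter:
    """Return a case-insensitive tally of every alphabetic character in text."""
    return Counter(ch for ch in text.lower() if ch.isalpha())


# Hypothetical registry: the model would emit a tool name plus arguments,
# and the host would run the matching function and return the exact result.
TOOLS = {"count_letter": count_letter, "letter_counts": letter_counts}


def call_tool(name: str, **kwargs):
    """Dispatch a tool call by name; raises KeyError for unknown tools."""
    return TOOLS[name](**kwargs)


if __name__ == "__main__":
    print(call_tool("count_letter", text="strawberry", letter="r"))
    print(call_tool("letter_counts", text="Hello"))
```

The point of the design is that the answer comes from ordinary string operations rather than from the model's token-level guesswork, so the count is exact for whatever text is passed in.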
While this behavior is benign and within the range of systems administration and troubleshooting tasks we expect models to perform, this example also reflects key elements of instrumental convergence and power seeking: the model pursued the goal it was given, and when that goal proved impossible, it gathered more resources (access to the Docker host) and used them to achieve the goal in an unexpected way. Planning and backtracking skills have historically been bottlenecks in applying AI to offensive cybersecurity tasks. Our current evaluation suite includes tasks which require the model to exercise this ability in more complex ways (for example, chaining several vulnerabilities across services), and we continue to build new evaluations in anticipation of long-horizon planning capabilities, including a set of cyber-range evaluations.
---------
It's a somewhat arbitrary feat of language and instruction following, one that would be annoying for me to do myself, and it seems impressive.
Previous releases could not reliably write a sestina. This one can!