Hacker News
charlescurt123 | 4 months ago | on: Learning to Reason with LLMs
It's RL, so it's going to be great on the tasks they created for training but not so much on others.
Impressive, but the problem with RL is that it requires knowledge of the future.