Story Detail of id 48312742 | Liveview Hacker News

RL is about more than facts. Synthetic feedback is an obvious approach: does the model suggest code that compiles and performs well?
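To make the "compiles and performs well" check concrete, here is a minimal sketch of what such a synthetic reward could look like. Everything in it is an assumption for illustration: the name `synthetic_reward`, the score values (0.0, 0.25, 1.0 plus a bonus), and the time budget are hypothetical and not taken from any particular RL setup.

```python
import time


def synthetic_reward(source, func_name, test_cases, time_budget_s=0.5):
    """Hypothetical reward for a model-suggested Python snippet.

    0.0          -> the source does not compile
    0.25         -> it compiles but crashes or returns a wrong answer
    1.0 to 2.0   -> it passes every test case; faster runs score higher
    """
    try:
        code = compile(source, "<candidate>", "exec")
    except SyntaxError:
        return 0.0

    namespace = {}
    try:
        # NOTE: exec on untrusted model output is unsafe; a real pipeline
        # would run this inside a sandbox. Kept inline to stay short.
        exec(code, namespace)
        func = namespace[func_name]
        start = time.perf_counter()
        for args, expected in test_cases:
            if func(*args) != expected:
                return 0.25
        elapsed = time.perf_counter() - start
    except Exception:
        return 0.25

    # Speed bonus in [0, 1]: the full bonus for an instant run, none once
    # the (assumed) time budget is used up.
    bonus = max(0.0, 1.0 - elapsed / time_budget_s)
    return 1.0 + bonus


good = "def add(a, b):\n    return a + b\n"
wrong = "def add(a, b):\n    return a - b\n"
broken = "def add(a, b) return a + b"
cases = [((1, 2), 3), ((0, 0), 0)]

print(synthetic_reward(good, "add", cases))
print(synthetic_reward(wrong, "add", cases))
print(synthetic_reward(broken, "add", cases))
```

The point of the graded score is that the signal comes from the compiler and the test run rather than from a human label, which is what makes the feedback synthetic.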