This says more about benchmarks than about R1, which I do believe is absolutely an impressive model.
For instance, in coding tasks, Sonnet 3.5 has benchmarked below other models for some time now, but there is a fairly prevalent view that Sonnet 3.5 is still the best coding model.