>reasoning capabilities in latest models are rapidly approaching superhuman levels and continue to scale with compute

I still have a pretty hard time getting it to tell me how many sisters Alice has. I think this might be a bit optimistic.

They plugged the hole for "how many 'r's in 'strawberry'", but I just asked it how many "l"s are in "lemolade" (spelling intentional) and it told me 1. If you give it something close to, but not exactly, a word it expects, it falls over.
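For reference, the right answers are trivial to check deterministically. A minimal Python sketch (`count_letter` is a hypothetical helper name of mine, not anything the model or a vendor uses):

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())

# The two probes mentioned above; the misspelling is intentional.
print(count_letter("strawberry", "r"))  # prints 3
print(count_letter("lemolade", "l"))    # prints 2
```

So the "1" it gave me for "lemolade" is off by one.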
I wonder if those special cases are handled by a bunch of if/else statements wrapped around the model :)