Yup, precisely. Turns out getting AI to be reliable at doing useful things is harder than the dominant narratives have led us all to believe.
https://www.normaltech.ai/p/new-paper-towards-a-science-of-a...