> Moving forward, the industry cannot continue to train bigger and bigger models since their intelligence not only plateaus but often will get worse
These are wild claims - why are we concluding that bigger models and more data = more hallucination? That’s actually the opposite of what’s been happening over the last couple of years. Some models may still hallucinate more, but they all hallucinate much less than the original 175B ChatGPT, which was smaller and trained on (much) less data than anything current.
Edit: My mention of data comes from this quote:
> A shift is happening among major AI labs, who are becoming increasingly skeptical of endless parameter count and training data scaling
My take on the current situation: it seems clear that the industry has seen that there is still a lot left to squeeze out of sub-1T models. But for that you do need more high-quality data in the distribution where you want to unlock capabilities.