Isn't it surprising that there were enough pre-1930 tokens to train an intelligent model? I was always under the impression that a large number of tokens is necessary to force the model to grok things and compress its learning into a somewhat intelligent model of the world, so to speak. But perhaps I'm underestimating how much digitized literature exists from that era.