Story Detail of id 42061987 | Liveview Hacker News

littlestymaar 18 hours ago | on: The deep learning boom caught almost everyone by surprise

In 2019, GPT-2 1.5B was trained on ~10B tokens.

Last week Hugging Face released SmolLM v2 1.7B trained on 11T tokens: 3 orders of magnitude more training data for roughly the same number of parameters, with almost the same architecture.

So we can say that even in 2019 we were working with a tiny amount of data compared to what is routine now.
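
For anyone who wants to check the arithmetic, here is a quick back-of-the-envelope sketch. It uses only the approximate figures quoted above (1.5B parameters / ~10B tokens for GPT-2, 1.7B parameters / 11T tokens for SmolLM v2); the variable names are made up for illustration, and the tokens-per-parameter figure is just one convenient way to frame the gap.

```python
import math

# Approximate figures as quoted in the comment (not independently verified).
gpt2_params, gpt2_tokens = 1.5e9, 10e9      # GPT-2 1.5B, ~10B tokens
smol_params, smol_tokens = 1.7e9, 11e12     # SmolLM v2 1.7B, 11T tokens

# How much more training data, as a ratio and in orders of magnitude.
ratio = smol_tokens / gpt2_tokens
orders = math.log10(ratio)

# Training tokens seen per model parameter, for each model.
gpt2_tokens_per_param = gpt2_tokens / gpt2_params
smol_tokens_per_param = smol_tokens / smol_params

print(f"data ratio: {ratio:.0f}x (about {orders:.1f} orders of magnitude)")
print(f"GPT-2 tokens per parameter: {gpt2_tokens_per_param:.1f}")
print(f"SmolLM v2 tokens per parameter: {smol_tokens_per_param:.0f}")
```

With these numbers the data ratio comes out to about 1100x, while the parameter counts differ by only about 13%.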
