Hacker News

The deep learning boom caught almost everyone by surprise

https://www.understandingai.org/p/why-the-deep-learning-boom-caught
> “Pre-ImageNet, people did not believe in data,” Li said in a September interview at the Computer History Museum. “Everyone was working on completely different paradigms in AI with a tiny bit of data.”

That's baloney. The old ML adage "there's no data like more data" is as old as mankind itself.

Not really. This refers back to the '80s, when people weren't even doing 'ML'. The focus then was on teasing out 'laws' from as few data points as possible: formulas, symbols, and relationships between individual data points, not the broad patterns we take for granted today.
I would say using backpropagation to train multi-layer neural networks qualifies as ML, and we were definitely doing that in the '80s.
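For context, '80s-style training really was this small: a handful of units, a handful of examples, plain gradient descent. Here is a minimal sketch (my own illustration, not anyone's actual period code) of backpropagation on a 2-2-1 sigmoid network learning XOR from its four data points:

```python
# Backpropagation on a tiny 2-2-1 sigmoid network learning XOR.
# Illustrative sketch of '80s-scale training: four data points total.
import math, random

random.seed(0)
DATA = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Weights: two hidden units (2 inputs + bias each), one output unit.
w_h = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
w_o = [random.uniform(-1, 1) for _ in range(3)]

def forward(x):
    h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_h]
    y = sigmoid(w_o[0] * h[0] + w_o[1] * h[1] + w_o[2])
    return h, y

def train_step(x, t, lr=0.5):
    h, y = forward(x)
    # Output delta for squared error through a sigmoid unit.
    d_o = (y - t) * y * (1 - y)
    # Hidden deltas, propagated back through the output weights.
    d_h = [d_o * w_o[j] * h[j] * (1 - h[j]) for j in range(2)]
    for j in range(2):
        w_o[j] -= lr * d_o * h[j]
    w_o[2] -= lr * d_o
    for j in range(2):
        for i in range(2):
            w_h[j][i] -= lr * d_h[j] * x[i]
        w_h[j][2] -= lr * d_h[j]

def mse():
    return sum((forward(x)[1] - t) ** 2 for x, t in DATA) / len(DATA)

before = mse()
for _ in range(5000):
    for x, t in DATA:
        train_step(x, t)
after = mse()
print(f"MSE before: {before:.3f}, after: {after:.3f}")
```

The entire "dataset" fits in one line of source; the leap since then has been less about the algorithm and more about the data and compute behind it.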
Just with tiny amounts of data.
Compared to today. We thought we used large amounts of data at the time.
"We thought we used large amounts of data at the time."

Really? Did it take at least an entire rack to store?

We didn't measure data size that way. At some point in the future, someone will find this dialog and think that we don't have large amounts of data now, because we are not using entire solar systems for storage.
The deep learning boom caught deep-learning researchers by surprise because deep-learning researchers don't understand their craft well enough to predict essential properties of their creations.

A model is grown, not crafted like a computer program, which makes it hard to predict. (More precisely, a big growth phase follows the crafting phase.)

I was a deep learning researcher. The problem is that research and funding prioritized accuracy (and related metrics). Factors like interpretability, extrapolation, efficiency, and consistency were not prioritized, even though they were clearly important prerequisites for real deployment.

DALL-E was the only big surprising consumer model: 2022 saw a sudden huge leap from "txt2img is kind of funny" to "txt2img is actually interesting". I would have assumed such a leap could only come in 2030 or later. But deep learning is full of counterintuitive results (like the no-free-lunch theorem not mattering in practice, or ReLU working better than sigmoid).
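The ReLU-over-sigmoid result is easy to demonstrate with a toy calculation (mine, not from the article): the sigmoid derivative is at most 0.25, so a chain of sigmoid layers shrinks gradients geometrically with depth, while ReLU passes gradients through unchanged for active units.

```python
# Toy illustration of vanishing gradients: the product of layer-wise
# derivatives through 20 layers, assuming every pre-activation is 0.5.
import math

def sigmoid_grad(z):
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1 - s)  # never exceeds 0.25

def relu_grad(z):
    return 1.0 if z > 0 else 0.0

depth = 20
z = 0.5  # assumed pre-activation at every layer (hypothetical)

sig_chain = 1.0
relu_chain = 1.0
for _ in range(depth):
    sig_chain *= sigmoid_grad(z)   # shrinks each layer
    relu_chain *= relu_grad(z)     # stays 1 for active units

print(f"sigmoid gradient after {depth} layers: {sig_chain:.3e}")
print(f"relu gradient after {depth} layers: {relu_chain:.3e}")
```

Counterintuitive in hindsight only: before 2010 or so, the smooth, bounded sigmoid looked like the obviously "correct" choice.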

But in hindsight, it was naive to think "this does not work yet" would get in the way of the products being sold and monetized.
