Hacker News new | past | comments | ask | show | jobs | submit
Hopefully in the next decade we’ll get there.

Vision+language multimodal models seem to solve some of the hard problems.