https://www.youtube.com/watch?v=GH9-EmgtABw

Saw this video recently, from an AI company working on picking up contextual cues from tone and body language. I think they're converting those cues to text and feeding that into an LLM, so it's not natively multimodal, but I still thought it was really cool.