This is a helpful method for visually grounding LLMs so they can take actions on the screen, such as clicking. For humans, though, hell no.