The second L in LLM stands for "language". Nothing you're describing has anything to do with language modeling.

They could be using transformers, sure. But plenty of transformer-based models are not LLMs.

They are probably looking for LGMs - Large Generative Models, which encompass vision & multi-modal models.