Hacker News new | past | comments | ask | show | jobs | submit
I think the only way you get to that kind of budget is by assuming that the models are like 5 or 10 times larger than most LLMs, and that you want to be able to do a lot of training runs simultaneously and quickly, AND build the power stations into the facilities at the same time. Maybe they are video or multimodal models that have text and image generation grounded in a ton of video data which eats a lot of VRAM.