I think it's simply because we haven't found a better training algorithm than backpropagation. We're stuck relying on massive datasets: run the numbers forward, measure the error, and work backward from that error to figure out how to adjust billions (or even trillions) of 'knobs.' Then we have to do this at least once for every single token in a training set scraped from a huge chunk of the internet. Even a tiny bit of computation per token, when multiplied by a base that massive, inevitably adds up to astronomical numbers.
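
To put rough numbers on that multiplication, here's a back-of-the-envelope sketch. It assumes the common rule of thumb of about 6 floating-point operations per parameter per training token (roughly 2 for the forward pass and 4 for the backward pass). The parameter and token counts are made-up round numbers for illustration, not figures for any real model, and `training_flops` is just a hypothetical helper name.

```python
# Rule of thumb (an approximation, not an exact count):
# ~2 FLOPs per parameter for the forward pass + ~4 for the backward pass.
FLOPS_PER_PARAM_PER_TOKEN = 6


def training_flops(n_params: int, n_tokens: int) -> int:
    """Estimate total training FLOPs for one pass over n_tokens tokens."""
    return FLOPS_PER_PARAM_PER_TOKEN * n_params * n_tokens


# Hypothetical round numbers, chosen only to show the scale.
n_params = 10**12  # one trillion "knobs"
n_tokens = 10**13  # ten trillion training tokens

per_token = FLOPS_PER_PARAM_PER_TOKEN * n_params
total = training_flops(n_params, n_tokens)

print(f"{per_token:.1e} FLOPs per token")  # 6.0e+12
print(f"{total:.1e} FLOPs in total")       # 6.0e+25
```

The per-token cost is already in the trillions here, and multiplying by the token count is what pushes the total into the 10^25 range.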