Not many workloads are RAM bandwidth limited. Power and latency are much more common bottlenecks, and HBM loses on both of those.
Multicore workloads do tend to hit RAM bandwidth limits before they hit power constraints. If you do the math, running every core at max frequency and full utilization usually leaves only a byte or so of RAM bandwidth per core clock cycle. Perhaps a mere handful of bytes for the highest-performance systems with in-package RAM.
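To make that back-of-the-envelope math concrete, here is a rough sketch. The function name and every number in it (core counts, clocks, bandwidth figures) are illustrative assumptions of mine, not measurements of any particular chip; plug in the spec-sheet values for whatever system you care about.

```python
def bytes_per_core_cycle(bandwidth_gb_s: float, cores: int, clock_ghz: float) -> float:
    """Peak DRAM bytes available to each core per clock cycle.

    Assumes all cores run at the same clock and share bandwidth evenly,
    and that the peak bandwidth figure is actually achievable.
    """
    bytes_per_second = bandwidth_gb_s * 1e9
    core_cycles_per_second = cores * clock_ghz * 1e9
    return bytes_per_second / core_cycles_per_second


# Hypothetical many-core server with off-package DRAM:
# 64 cores at 3.0 GHz sharing 300 GB/s.
server = bytes_per_core_cycle(bandwidth_gb_s=300, cores=64, clock_ghz=3.0)

# Hypothetical high-end part with in-package RAM:
# 16 cores at 3.5 GHz sharing 400 GB/s.
in_package = bytes_per_core_cycle(bandwidth_gb_s=400, cores=16, clock_ghz=3.5)

print(f"server:     {server:.2f} bytes per core cycle")
print(f"in-package: {in_package:.2f} bytes per core cycle")
```

With these made-up inputs the server lands between one and two bytes per core cycle and the in-package part at a handful, which is the shape of the claim above. For comparison, a single 64-byte cache line per cycle per core would need far more bandwidth than either.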