Does anyone have inside info on what these Huawei chips look like? I know Google's TPU pods use a torus interconnect topology, unlike Nvidia's switched, fully connected one. Maybe a similar architectural decision on the Huawei chips leads to bottlenecks in serving?
https://www.huawei.com/en/news/2026/3/mwc-superpod-ai
>For AI computing, the Atlas 950 SuperPoD, powered by UnifiedBus, integrates 64 NPUs per cabinet and can scale up to 8,192 NPUs, delivering superior performance for large-scale AI training and high-concurrency inference.
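For intuition on why topology alone could matter here: in a torus, a message between two random NPUs crosses several intermediate links, while a switched all-to-all fabric is always one hop. A toy calculation (hypothetical 3D torus sizes for illustration, not the actual layout of any of these chips):

```python
# Toy comparison: average hop count between two random NPUs in a
# k x k x k 3D torus vs. a fully connected (switched) fabric.
# Sizes are illustrative only, not real Huawei/Google/Nvidia specs.

def ring_dist(a, b, k):
    """Shortest distance between positions a and b on a ring of size k."""
    d = abs(a - b)
    return min(d, k - d)

def avg_torus_hops(k):
    """Average hop count over all distinct node pairs in a k^3 3D torus."""
    nodes = [(x, y, z) for x in range(k) for y in range(k) for z in range(k)]
    total = 0
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            # Torus routing distance = sum of per-dimension ring distances
            total += sum(ring_dist(a, b, k) for a, b in zip(u, v))
    n = len(nodes)
    return total / (n * (n - 1) // 2)

k = 8  # 8x8x8 torus = 512 NPUs
print(f"3D torus, {k**3} NPUs: avg {avg_torus_hops(k):.2f} hops")
print("Fully connected fabric: 1 hop regardless of size")
```

Average hops in the torus grow with pod size, which is fine for training (mostly neighbor/all-reduce traffic) but can hurt latency-sensitive, high-concurrency inference where traffic patterns are less regular.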