Meh. DGX is Arm and CUDA; Strix is x86 and ROCm. CUDA has better support than ROCm, and x86 has better support than Arm.
Nowadays I find most things work fine on Arm. Sometimes something needs to be built from source which is genuinely annoying. But moving from CUDA to ROCm is often more like a rewrite than a recompile.
> But moving from CUDA to ROCm is often more like a rewrite than a recompile.
Isn't everyone* in this segment just using PyTorch for training, or wrappers like Ollama/vllm/llama.cpp for inference? None of them has a strict dependency on CUDA. PyTorch's AMD backend is solid (for supported platforms, and Strix Halo is supported).
* enthusiasts whose budget is in the $5k range. If you're vendor-locked to CUDA, Mac Mini and Strix Halo are immediately ruled out.
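To illustrate the "no strict CUDA dependency" point: typical PyTorch user code never names CUDA-specific APIs directly. On ROCm builds, `torch.cuda` is backed by HIP, so the same `"cuda"` device string works on AMD GPUs. A minimal sketch of the usual device-selection pattern (`pick_device` is a hypothetical helper, not a PyTorch API; the block degrades to `"cpu"` if torch isn't installed):

```python
import importlib.util

def pick_device() -> str:
    """Pick the best available compute device, staying backend-agnostic."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # torch not installed at all
    import torch
    # is_available() returns True on both CUDA and ROCm builds of PyTorch;
    # ROCm deliberately reuses the torch.cuda namespace.
    if torch.cuda.is_available():
        return "cuda"
    # Apple Silicon exposes the Metal backend as "mps".
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"

print(pick_device())
```

Model code written this way (`model.to(pick_device())`) runs unchanged on Nvidia, AMD, or Apple hardware, which is why the vendor lock-in question mostly lives below PyTorch, in the tuned kernels.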
Most everything starts as PyTorch (or maybe JAX). But the inference engines all use hand-tuned CUDA kernels - at least the good ones do. You have to do that to optimize things.
I'm certain inference engines don't use hand-tuned CUDA on Radeon or Mac Mini chips. My statement holds: those engines have no strict dependency on CUDA, or they'd be Nvidia-only.
CUDA != driver support. Driver support seems to be what's spotty with DGX, and IIRC Nvidia has only committed to updates for 2 years or something.