Story Detail of id 47471270 | Liveview Hacker News

BobbyJo 1 day ago | on: Tinybox – A powerful computer for deep learning

The internet seems to think the SW support for those is bad, and that Strix Halo boxes are better ROI.

Meh. DGX is Arm and CUDA. Strix is x86 and ROCm. CUDA has better support than ROCm. And x86 has better support than Arm.

Nowadays I find most things work fine on Arm. Sometimes something needs to be built from source which is genuinely annoying. But moving from CUDA to ROCm is often more like a rewrite than a recompile.

overfeed 21 hours ago | root | parent | next

> But moving from CUDA to ROCm is often more like a rewrite than a recompile.

Isn't everyone* in this segment just using PyTorch for training, or wrappers like Ollama/vLLM/llama.cpp for inference? None have a strict dependency on CUDA. PyTorch's AMD backend is solid (for supported platforms, and Strix Halo is supported).

* enthusiasts whose budget is in the $5k range. If you're vendor-locked to CUDA, Mac Mini and Strix Halo are immediately ruled out.
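
For what "no strict dependency on CUDA" can look like at the PyTorch level, here is a minimal sketch, not anyone's actual code from this thread. The helper names `pick_device` and `backend_label` are made up for illustration. It relies on ROCm builds of PyTorch exposing the GPU through the same `torch.cuda` API and setting `torch.version.hip`, and on Apple silicon being reachable through `torch.backends.mps`. The import is wrapped so the sketch still runs (and falls back to CPU) where PyTorch is not installed.

```python
# Sketch: choose a compute backend without hard-coding CUDA.
try:
    import torch
except ImportError:  # keeps the sketch runnable without PyTorch
    torch = None


def pick_device() -> str:
    """Return the first available backend name: "cuda", "mps", or "cpu"."""
    if torch is None:
        return "cpu"
    # ROCm builds of PyTorch report through torch.cuda as well,
    # so this single check covers both NVIDIA and AMD GPUs.
    if torch.cuda.is_available():
        return "cuda"
    # Apple silicon GPUs are exposed through the MPS backend.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


def backend_label() -> str:
    """Describe which GPU stack the installed PyTorch build targets."""
    if torch is None:
        return "no PyTorch"
    if getattr(torch.version, "hip", None):
        return "ROCm/HIP"
    if getattr(torch.version, "cuda", None):
        return "CUDA"
    return "CPU-only"


device = pick_device()
print(device, backend_label())
```

Model code would then call something like `model.to(device)` with the returned string and run unchanged on any of the three. This only shows the framework-level portability; it says nothing about how well a given kernel is tuned on each backend.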

oofbey 5 hours ago | root | parent

Most everything starts as PyTorch. (Or maybe JAX.) But the inference engines all use hand-tuned CUDA kernels - at least the good ones do. You have to do that to optimize things.

overfeed 1 hour ago | root | parent

I'm certain inference engines don't use hand-tuned CUDA on Radeon or Mac Mini chips. My statement holds: those engines have no strict dependency on CUDA, or they'd be Nvidia-only.

BobbyJo 23 hours ago | root | parent

CUDA != driver support. Driver support seems to be what's spotty with DGX, and iirc Nvidia has only committed to updates for 2 years or something.
