Show HN: Llama 3.1 70B on a single RTX 3090 via NVMe-to-GPU bypassing the CPU

https://github.com/xaskasdf/ntransformer