For anyone wanting GGUFs, I uploaded them to https://huggingface.co/collections/unsloth/deepseek-r1-all-v...
There are distilled R1 GGUFs for Llama 8B and Qwen 1.5B, 7B, and 14B; I'm still uploading Llama 70B and Qwen 32B.
I also uploaded a 2-bit quant of the large MoE (200GB on disk) to https://huggingface.co/unsloth/DeepSeek-R1-GGUF
Thank you.
Which is currently the most capable version running reasonably fast on a 3090 (24GB of VRAM)?
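A rough way to answer this yourself is to estimate the quantized file size against 24GB of VRAM. The sketch below is a back-of-the-envelope estimator, not an exact llama.cpp calculation: the bits-per-weight values and the fixed context/KV-cache overhead are assumptions for illustration.

```python
# Assumed effective bits-per-weight for common GGUF quant types
# (approximate; real sizes vary slightly per architecture).
BPW = {"Q2_K": 2.6, "Q4_K_M": 4.8, "Q8_0": 8.5}

def gguf_gib(params_b: float, quant: str, overhead_gib: float = 1.5) -> float:
    """Approximate VRAM in GiB for a model with params_b billion weights,
    plus a flat (assumed) overhead for context/KV cache."""
    total_bytes = params_b * 1e9 * BPW[quant] / 8
    return total_bytes / 2**30 + overhead_gib

for name, p in [("Qwen 14B", 14.0), ("Qwen 32B", 32.0)]:
    for q in ("Q4_K_M", "Q2_K"):
        print(f"{name} {q}: ~{gguf_gib(p, q):.1f} GiB")
```

By this estimate, a 32B model at a 4-bit quant comes in just under 24 GiB (tight, with little room for long context), while 14B at 4-bit fits comfortably; 70B would need heavy offloading to system RAM.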