KVarN: Native vLLM backend for KV-cache quantization by Huawei

https://github.com/huawei-csl/KVarN