Consistency LLM: converting LLMs to parallel decoders accelerates inference 3.5x

https://hao-ai-lab.github.io/blogs/cllm/
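
For context, the linked post's "parallel decoder" is built on Jacobi (fixed-point) decoding: instead of emitting one token per forward pass, the model starts from a guessed n-token block and refines every position of the block simultaneously until it stops changing, and CLLM's fine-tuning objective trains the model to reach that fixed point in far fewer than n iterations. Below is a minimal sketch of the Jacobi loop, not the authors' code; toy_next_token is a hypothetical deterministic stand-in for the model's greedy next-token prediction, and in a real model one batched forward pass would score all n positions at once.

    import random

    VOCAB = 50

    def toy_next_token(context):
        # Hypothetical stand-in for an LLM's greedy argmax prediction.
        # A real CLLM computes all block positions in one forward pass.
        return (sum(context) * 31 + len(context)) % VOCAB

    def jacobi_decode(prefix, n_tokens, max_iters=50):
        # Decode an n-token block by fixed-point iteration rather than
        # token-by-token autoregression. Every position is recomputed
        # from the *previous* iteration's guess (Jacobi-style), and
        # decoding stops once the whole block stops changing.
        guess = [random.randrange(VOCAB) for _ in range(n_tokens)]
        for it in range(max_iters):
            new_guess = [toy_next_token(prefix + guess[:i])
                         for i in range(n_tokens)]
            if new_guess == guess:   # fixed point reached
                return guess, it
            guess = new_guess
        return guess, max_iters

    prefix = [1, 2, 3]
    block, iters = jacobi_decode(prefix, n_tokens=8)
    print(f"decoded {block} in {iters} iterations (vs. 8 sequential AR steps)")

With greedy decoding, the fixed point is identical to what ordinary autoregressive decoding would emit, so output quality is preserved; the reported ~3.5x speedup comes from the fine-tuned model converging in a small fraction of n iterations while each iteration evaluates all n positions in parallel.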