Story Detail of id 48388459 | Liveview Hacker News

Unfortunately there are no GGUF quants of the assistant model yet: https://huggingface.co/models?other=base_model:quantized:goo...

This has been my impression.

The underlying LiteRT-LM framework used in the Edge Gallery does support the MTP drafters for the smaller models, but it comes with this caveat:

> Note: LiteRT-LM supports E2B and E4B models today, with support for larger models coming soon.

So even Google aren't shipping MTP support for the 26B and 31B models yet.
