So use a block-based cache and tune the block size to maximize the hit rate? This isn’t rocket science.
This seems misguided. You have to cache a prefix, because with causal attention each token's KV entries depend on every token before it.
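The two points above are not necessarily in conflict. A rough sketch, assuming a toy cache that is just a set of keys and token IDs that are plain integers: cut the sequence into fixed-size blocks, but key each block on a hash chained through everything before it, so a block only hits when its entire prefix matches. The names `block_keys` and `cached_prefix_len` are made up for illustration and are not taken from any particular serving system.

```python
import hashlib

def block_keys(tokens, block_size):
    """Return one key per full block; each key covers the block and its whole prefix."""
    keys = []
    parent = b""  # hash of everything before the current block
    for start in range(0, len(tokens) - block_size + 1, block_size):
        block = tokens[start:start + block_size]
        key = hashlib.sha256(parent + repr(block).encode()).digest()
        keys.append(key)
        parent = key  # chain: the next key depends on all earlier blocks
    return keys

def cached_prefix_len(tokens, block_size, cache):
    """Count leading tokens whose blocks are already in `cache` (a set of keys)."""
    hits = 0
    for key in block_keys(tokens, block_size):
        if key not in cache:
            break  # first miss ends the reusable prefix
        hits += 1
    return hits * block_size

cache = set(block_keys([1, 2, 3, 4, 5, 6, 7, 8], block_size=4))
shared = cached_prefix_len([1, 2, 3, 4, 9, 9, 9, 9], 4, cache)
```

Block size is still a tunable here: smaller blocks waste fewer tokens at the end of a partial match, at the cost of more keys to hash and look up.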