Claude Code gets >98% KV cache hits. It's not reprocessing unless you let the cache go cold (5 minutes, which is annoyingly short).
I meant caching at a bigger level. If you're an organization with 100 developers each doing 10 sessions a day, you're paying for the tokens of a frequently used document 1,000 times over, even if you had 100% KV cache hits within each session. Apparently that's too costly even for companies with a trillion-dollar market cap...
Normally KV cache works only if your context prefix is identical, but there are papers which demonstrate documents can be cached between different contexts.
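To illustrate the prefix constraint, here is a minimal Python sketch. It is a toy model, not how any particular inference server implements its cache: prompts are lists of hypothetical block labels, and only the leading run of identical blocks counts as reusable.

```python
def reusable_prefix_len(cached: list[str], incoming: list[str]) -> int:
    """Count the leading blocks that are identical in both prompts.

    Toy model of prefix caching: reuse stops at the first block that differs.
    """
    n = 0
    for old, new in zip(cached, incoming):
        if old != new:
            break
        n += 1
    return n


# Same system prompt and document, different question: the shared prefix is reusable.
print(reusable_prefix_len(["system", "doc", "question A"],
                          ["system", "doc", "question B"]))  # 2

# Same document behind a different first block: nothing is reusable in this model.
print(reusable_prefix_len(["alice's preamble", "doc"],
                          ["bob's preamble", "doc"]))  # 0
```

The second case is the cross-session situation: the document is identical, but because something different comes before it, a plain prefix cache cannot reuse it.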
Ah, understood, and thanks for the clarification!
Are you sure that hitting the cache means you're not paying for those tokens?
You pay, but at 10% of the non-cached price (in quota or dollars). See https://platform.claude.com/docs/en/about-claude/pricing
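A rough back-of-the-envelope sketch of what that 10% rate means for the organization example earlier in the thread. The 10% multiplier and the 100 developers x 10 sessions come from the comments above; the per-token price and the document size are made-up placeholders, and the sketch ignores any extra charge for writing to the cache.

```python
def input_cost(tokens: int, price_per_mtok: float,
               cached_fraction: float, cached_multiplier: float = 0.10) -> float:
    """Dollar cost of input tokens when a fraction of them are cache reads.

    cached_multiplier=0.10 is the "10% of the non-cached price" from the thread.
    """
    cached = tokens * cached_fraction
    uncached = tokens - cached
    return (uncached + cached * cached_multiplier) * price_per_mtok / 1_000_000


PRICE_PER_MTOK = 3.00   # hypothetical price in dollars per million input tokens
DOC_TOKENS = 50_000     # hypothetical size of the shared document
SESSIONS = 100 * 10     # 100 developers x 10 sessions a day

daily_tokens = DOC_TOKENS * SESSIONS
print("no cache hits:  ", input_cost(daily_tokens, PRICE_PER_MTOK, 0.0))
print("all cache reads:", input_cost(daily_tokens, PRICE_PER_MTOK, 1.0))
```

Even in the best case the cached document is still billed on every read, just at a tenth of the rate.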