Story Detail of id 47203790 | Liveview Hacker News

blakec 18 hours ago | on: MCP server that reduces Claude Code context consumption by 98%

The FTS5 index approach here is right, but I'd push further: pure BM25 underperforms on tool outputs because they're a mix of structured data (JSON, tables, config) and natural language (comments, error messages, docstrings). Keyword matching falls apart on the structured half.

I built a hybrid retriever for a similar problem, compressing a 15,800-file Obsidian vault into a searchable index for Claude Code. Stack is Model2Vec (potion-base-8M, 256-dimensional embeddings) + sqlite-vec for vector search + FTS5 for BM25, combined via Reciprocal Rank Fusion. The database is 49,746 chunks in 83MB. RRF is the important piece: it merges ranked lists from both retrieval methods without needing score calibration, so you get BM25's exact-match precision on identifiers and function names plus vector search's semantic matching on descriptions and error context.
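For anyone unfamiliar with RRF, here is a minimal sketch of the fusion step in Python. It is not the code from my retriever: the function name, the chunk IDs, and the constant k=60 (the value commonly used in the RRF literature) are assumptions for illustration. Each input is a list of chunk IDs already ordered best-first by one retriever.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several best-first rankings into one list of IDs.

    k=60 is an assumed smoothing constant, not a value taken from the comment.
    Only ranks are used, so raw BM25 and cosine scores never need calibrating.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the two retrievers for the same query.
bm25_hits = ["chunk_a", "chunk_b", "chunk_c"]
vector_hits = ["chunk_b", "chunk_d", "chunk_a"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

Chunks that both retrievers return accumulate two contributions and rise to the top, while chunks found by only one method still appear further down the fused list.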

The incremental indexing matters too. If you're indexing tool outputs per-session, the corpus grows fast. My indexer has a --incremental flag that hashes content and only re-embeds changed chunks. Full reindex of 15,800 files takes ~4 minutes; incremental on a typical day's changes is under 10 seconds.
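The hash check behind an incremental pass can be sketched roughly like this. It is an illustration under assumptions, not my indexer: the names `plan_incremental` and `stored_hashes`, the use of SHA-256, and the in-memory dicts standing in for the database are all hypothetical.

```python
import hashlib


def content_hash(text):
    """Return a stable digest of a chunk's text (SHA-256 is an assumed choice)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_incremental(chunks, stored_hashes):
    """Decide which chunks need re-embedding.

    chunks:        {chunk_id: text} for the current corpus
    stored_hashes: {chunk_id: digest} recorded by the previous indexing run
    Returns (ids to embed, ids to delete, updated digest map).
    """
    to_embed = []
    new_hashes = {}
    for chunk_id, text in chunks.items():
        digest = content_hash(text)
        new_hashes[chunk_id] = digest
        # New chunks have no stored digest; changed chunks have a different one.
        if stored_hashes.get(chunk_id) != digest:
            to_embed.append(chunk_id)
    removed = [cid for cid in stored_hashes if cid not in chunks]
    return to_embed, removed, new_hashes


# Hypothetical state: one unchanged note, one edited, one new, one deleted.
previous = {
    "note1#0": content_hash("alpha"),
    "note2#0": content_hash("beta"),
    "note3#0": content_hash("gamma"),
}
current = {"note1#0": "alpha", "note2#0": "beta (edited)", "note4#0": "delta"}

to_embed, removed, new_hashes = plan_incremental(current, previous)
print(to_embed, removed)
```

Only the IDs in `to_embed` go through the embedding model, which is where the time saving over a full reindex comes from.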

On the caching question raised upthread: this approach actually helps prompt caching because the compressed output is deterministic for the same query. The raw tool output would be different every time (timestamps, ordering), but the retrieved summary is stable if the underlying data hasn't changed.

One thing I'd add to Context Mode's architecture: the same retriever could run as a PostToolUse hook, compressing outputs before they enter the conversation. That way it's transparent to the agent: it never sees the raw dump, just the relevant subset.

thecopy 10 hours ago | parent | next

Very interesting. One big wrinkle with OP's approach is exactly that: structured responses, which many tools return, are left untouched. The solution in OP, as I understand it, is the "execute" method. However, I'm building an MCP gateway, and such sandboxed execution isn't available (...yet), so your approach sounds very clever. I'll spend the day trying it out.


danw1979 14 hours ago | parent | next

Would love to read a more in-depth write-up of this if you have the time!

I suspect the obsessive note-taker crowd on HN would appreciate it too.

tclancy 10 hours ago | parent

Seconded that I would love to see the what, why and how of your Obsidian work.
