Oh that's quite a nice idea - agentic context management (riffing on agentic memory management).
There are some challenges around the LLM having enough output tokens to easily specify what it wants its next input tokens to be, but "snips" can be expressed concisely (i.e. "the next input should include everything sent previously except the chunk that starts XXX and ends YYY"). The upside is tighter context; the downside is that it'll bust the prompt cache (perhaps the optimal trade-off is to batch the snips).
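A minimal sketch of what batched snips could look like in a harness, assuming the model names each chunk by its first and last characters (all names here are illustrative, not from any real API):

```python
# Hypothetical sketch of batched "snips": the model identifies chunks by how
# they start and end, and the harness rebuilds the next input without them.
def apply_snips(history: list[str], snips: list[tuple[str, str]]) -> list[str]:
    """Drop every message that starts with `start` and ends with `end`.

    Applying all snips in one pass means the prompt cache is invalidated
    once per batch instead of once per individual snip.
    """
    def snipped(msg: str) -> bool:
        return any(msg.startswith(s) and msg.endswith(e) for s, e in snips)
    return [msg for msg in history if not snipped(msg)]

history = [
    "user: summarize this log",
    "tool: [40kB of raw log output ... end of log]",
    "assistant: the log shows 3 failed deploys",
]
# One batched snip removes the bulky tool output before the next turn.
trimmed = apply_snips(history, [("tool: [40kB", "end of log]")])
```

The point of the batch is that everything before the first snipped chunk still hits the cache; you only pay the cache miss once per batch.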
Good point on prompt cache invalidation. Context-mode sidesteps this by never letting the bloat in to begin with, rather than snipping it out after. Tool output runs in a sandbox, a short summary enters context, and the raw data sits in a local search index. No cache busting because the big payload never hits the conversation history in the first place.
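The flow described above (summary in context, raw payload in a local index) could be sketched like this; the class and method names are made up for illustration, and the "index" is a naive grep stand-in rather than a real search index:

```python
# Hypothetical sketch of "context-mode": raw tool output never enters the
# conversation history; only a short summary does, and the full payload goes
# to a local store the agent can search later.
class ContextMode:
    def __init__(self) -> None:
        self.index: dict[str, str] = {}  # stand-in for a real search index

    def run_tool(self, tool_id: str, raw_output: str) -> str:
        """Store the raw payload and return only a summary for the context."""
        self.index[tool_id] = raw_output
        head = raw_output[:120].replace("\n", " ")
        return f"[{tool_id}: {len(raw_output)} bytes indexed] {head}..."

    def search(self, tool_id: str, needle: str) -> list[str]:
        # Naive line-level grep; a real harness would use FTS or embeddings.
        return [ln for ln in self.index.get(tool_id, "").splitlines() if needle in ln]

cm = ContextMode()
summary = cm.run_tool("grep-1", "line one\nERROR: disk full\nline three")
```

Only `summary` would be appended to the conversation; the 40kB payload never touches the prompt, so the cache prefix stays stable.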
So I built that in my chat harness. I just gave the agent a “prune” tool and it can remove shit it doesn’t need any more from its own context. But chat is last gen.
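A prune tool like the one described could be as small as this; the message-id scheme is an assumption, not the poster's actual implementation:

```python
# Hypothetical sketch of a "prune" tool: the agent passes the ids of
# messages it no longer needs, and the harness drops them before building
# the next turn's input.
def prune(history: list[dict], message_ids: list[int]) -> list[dict]:
    dead = set(message_ids)
    return [m for m in history if m["id"] not in dead]

history = [
    {"id": 0, "role": "user", "content": "run the tests"},
    {"id": 1, "role": "tool", "content": "...huge test runner output..."},
    {"id": 2, "role": "assistant", "content": "all 412 tests pass"},
]
# The agent decides it no longer needs the raw runner output.
history = prune(history, [1])
```

Unlike context-mode, this snips after the fact, so it still busts the cache from the pruned message onward.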