The core idea: every MCP tool call dumps raw data into your 200K context window. Context Mode spawns isolated subprocesses — only stdout enters context. No LLM calls, purely algorithmic: SQLite FTS5 with BM25 ranking and Porter stemming.
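For anyone who wants to see the shape of that idea, here is a minimal sketch, not the project's actual code. It assumes Python's bundled SQLite was compiled with FTS5 (most builds are), and the names `run_isolated`, `build_index`, and `search` are made up for illustration.

```python
import sqlite3
import subprocess
import sys


def run_isolated(code: str) -> str:
    """Run a snippet in a separate Python process; only its stdout comes back."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return result.stdout


def build_index(chunks):
    """Index text chunks in an in-memory FTS5 table with Porter stemming."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE VIRTUAL TABLE chunks USING fts5(body, tokenize='porter unicode61')"
    )
    db.executemany("INSERT INTO chunks(body) VALUES (?)", [(c,) for c in chunks])
    return db


def search(db, query, limit=3):
    """Return the best-matching chunks; bm25() is lower for better matches."""
    rows = db.execute(
        "SELECT body FROM chunks WHERE chunks MATCH ? ORDER BY bm25(chunks) LIMIT ?",
        (query, limit),
    )
    return [row[0] for row in rows]
```

With stemming, a query for "connecting" also matches a chunk containing "Connection", so the model can pull back a few relevant lines instead of the whole output.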
Since the last post we've seen 228 stars and some real-world usage data. The biggest surprise was how much subagent routing matters — auto-upgrading Bash subagents to general-purpose so they can use batch_execute instead of flooding context with raw output.
Source: https://github.com/mksglu/claude-context-mode
Happy to answer any architecture questions.
Let me connect the dots (and please correct me if I'm connecting them wrong): context-mode _does not address MCP context usage at all_, correct? Are you instead suggesting we refactor or eliminate MCP tools, or apply concepts similar to context-mode in our MCPs where possible?
Context-mode is still very high value, even if the answer is "no," just want to make sure I understand. Also interested in your thoughts about the above.
I write a number of MCPs that work across all Claude surfaces, so the usual "CLI!" isn't as viable an answer (though with code execution it sometimes can be) ...
Edit: typo
Edit: clarify "MCP tool"
Confirmed: context-mode cannot intercept MCP tool responses. The PreToolUse hook (hooks/pretooluse.sh) matches only Bash|Read|Grep|Glob|WebFetch|WebSearch|Task. When I called my obsidian MCP's obsidian_list via MCP, the response went straight into context — zero entries in context-mode's FTS5 database. The web fetches from the same session were all indexed.
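To illustrate why the MCP call slipped past, here is a rough sketch of the matching step. It assumes the matcher is applied as a regex against the whole tool name and that MCP tools show up under names of the form `mcp__<server>__<tool>`; the server name `obsidian` and the function `hook_fires` are my own placeholders, not something taken from the repo.

```python
import re

# Matcher string as quoted from hooks/pretooluse.sh above.
MATCHER = "Bash|Read|Grep|Glob|WebFetch|WebSearch|Task"


def hook_fires(tool_name: str, matcher: str = MATCHER) -> bool:
    """Return True if the tool name matches the hook's matcher in full."""
    return re.fullmatch(matcher, tool_name) is not None


for name in ["WebFetch", "Bash", "mcp__obsidian__obsidian_list"]:
    print(name, hook_fires(name))
```

Under those assumptions the built-in tools match and the MCP tool name does not, which is consistent with the empty FTS5 database I saw for the obsidian call.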
The context-mode skill (SKILL.md) actually acknowledges this at lines 71-77 with an "after-the-fact" decision tree for MCP output: if it's already in context, use it directly; if you need to search it again, save to file then index. But that's damage control — the context is already consumed. You can't un-eat those tokens.
The architectural reason: MCP tool responses flow via JSON-RPC directly to the model. There's no PostToolUse hook in Claude Code that could modify or compress a response before it enters context. And you can't call MCP tools from inside a subprocess, so the "run it in a sandbox" pattern doesn't apply.
So the 98% savings are real but scoped to built-in tools and CLI wrappers (curl, gh, kubectl, etc.) — anything replicable in a subprocess. For third-party MCP tools with unique capabilities (Excalidraw rendering, calendar APIs, Obsidian vault access), the MCP author has to apply context-mode's concepts server-side: return compact summaries, store full output queryably, expose drill-down tools. Which is essentially what you suggested above.
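A minimal sketch of what that server-side pattern could look like, independent of any MCP SDK. The class `OutputStore`, its methods, and the `out-N` handle format are all hypothetical; a real server would wire `summarize` into the tool's response and expose `drill_down` as a second tool.

```python
import itertools


class OutputStore:
    """Keep full tool output on the server; hand the model a summary and a handle."""

    def __init__(self, preview_lines: int = 3):
        self._outputs = {}
        self._ids = itertools.count(1)
        self.preview_lines = preview_lines

    def summarize(self, full_output: str) -> dict:
        """Store the full output and return only a compact summary."""
        handle = f"out-{next(self._ids)}"
        lines = full_output.splitlines()
        self._outputs[handle] = lines
        return {
            "handle": handle,
            "total_lines": len(lines),
            "preview": lines[: self.preview_lines],
        }

    def drill_down(self, handle: str, needle: str, limit: int = 5) -> list:
        """Return up to `limit` stored lines containing `needle` (case-insensitive)."""
        lines = self._outputs.get(handle, [])
        hits = [line for line in lines if needle.lower() in line.lower()]
        return hits[:limit]
```

The substring search here is a stand-in; swapping in an FTS5 index would bring it closer to what context-mode does for built-in tools.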
Still very high value for the built-in tool side. Just want the boundary to be clear.
Correct any misconceptions please!
@dang it's really bad lately.