The real problem isn't MCP vs CLI, it's that MCP originally loaded every tool definition into context upfront. A typical multi-server setup (GitHub, Slack, Sentry, Grafana, Splunk) consumes ~55K tokens in definitions before Claude does any work. Tool selection accuracy also degrades past 30-50 tools.
Anthropic's Tool Search fixes this with per-tool lazy loading - tools are defined with defer_loading: true, Claude only sees a search index, and full schemas load on demand for the 3-5 tools actually needed. 85% token reduction. The original "everything upfront" design was wrong, but the protocol is catching up.
But then I wonder, where the balance is between a bunch of small tool calls, vs one larger one.
I recall some recent discussion here on hn on big data analysis
The problem with MCP is that you are paying the price in token usage, even if you are not using the MCP server. Why would anybody want that?
And no, the tool search function recently introduced by Anthropic does not completely solve this problem.