MCP-compress-router – MCP Compressor

Author: ameshkov

by ameshkov · Jun 29, 2026 · 1 point · 0 comments


AI Analysis

●● Solid · Solve My Problem · Niche Gem

Cuts per-turn MCP tool-description overhead from about 26K tokens to roughly 900-2,000 tokens with a two-tool proxy architecture.

Strengths

• Concrete cost math: about $0.93 of tool-description overhead per 50-turn session with three MCP servers, cut by 90%+
• Simple two-tool interface (`get_tool_schema`, `invoke_tool`) replaces full tool descriptions
• Supports OAuth flows, including GitHub and Figma MCP special cases

Weaknesses

• Only valuable for users running multiple MCP servers simultaneously
• MCP ecosystem still emerging, limiting current addressable audience

Post Description

When you have multiple MCP servers, every request to the LLM will include all of their tools and descriptions, which can quickly eat up your token limit and increase costs. The thing is, most of the time, you don't need all of them.

For example, let’s take three popular MCP servers: Notion, GitHub, and Pylance. The overhead they create on every turn is about 26K tokens. If we assume an average 50-turn coding session and Opus pricing, the overhead for a single session is about $0.9275.

`mcp-compress-router` does something very simple: it proxies all MCP servers behind just two tools, `get_tool_schema` and `invoke_tool`. The `get_tool_schema` description lists the tool names and arguments for all downstream MCP server tools so that the agent knows what's available. Whenever the agent needs a tool, it first calls `get_tool_schema` to read that tool's full description and argument schema, and then calls `invoke_tool`, which proxies the call to the downstream MCP server.
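To make the flow concrete, here is a minimal in-memory sketch of the two-tool idea. It is not the project's actual code: the `DOWNSTREAM_TOOLS` registry, the tool names `github.create_issue` and `notion.search`, the `compact_listing` helper, and the handlers are all made up for illustration, and a real proxy would forward calls over MCP instead of calling local functions.

```python
from typing import Any

# Hypothetical stand-in for the tools exposed by downstream MCP servers.
# Tool names, descriptions, and handlers are invented for this sketch.
DOWNSTREAM_TOOLS: dict[str, dict[str, Any]] = {
    "github.create_issue": {
        "description": "Create an issue in a repository.",
        "input_schema": {
            "type": "object",
            "properties": {
                "repo": {"type": "string"},
                "title": {"type": "string"},
            },
            "required": ["repo", "title"],
        },
        "handler": lambda args: f"created '{args['title']}' in {args['repo']}",
    },
    "notion.search": {
        "description": "Search pages by a text query.",
        "input_schema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
        "handler": lambda args: [f"page matching {args['query']}"],
    },
}


def compact_listing(level: str = "high") -> str:
    """Build the short catalogue that goes into the get_tool_schema description.

    "max" lists tool names only; any other level lists names plus argument names.
    """
    lines = []
    for name, tool in sorted(DOWNSTREAM_TOOLS.items()):
        if level == "max":
            lines.append(name)
        else:
            arg_names = ", ".join(tool["input_schema"]["properties"])
            lines.append(f"{name}({arg_names})")
    return "\n".join(lines)


def get_tool_schema(tool_name: str) -> dict[str, Any]:
    """Return the full description and argument schema for one tool, on demand."""
    tool = DOWNSTREAM_TOOLS[tool_name]
    return {
        "name": tool_name,
        "description": tool["description"],
        "input_schema": tool["input_schema"],
    }


def invoke_tool(tool_name: str, arguments: dict[str, Any]) -> Any:
    """Forward the call to the downstream tool (here, a local function)."""
    return DOWNSTREAM_TOOLS[tool_name]["handler"](arguments)


if __name__ == "__main__":
    print(compact_listing("high"))              # what the model sees on every turn
    print(get_tool_schema("notion.search"))     # fetched only when the tool is needed
    print(invoke_tool("notion.search", {"query": "roadmap"}))
```

The point of the design is that only the output of `compact_listing` is paid for on every turn; the full schema is fetched once, at the moment a tool is actually used.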

The savings are pretty serious. In the three-server example, the 26K-token overhead compresses to about 900 tokens with the "max" compression level (just tool names), or to about 2,000 tokens with the "high" compression level (the default one: tool names plus argument names). So you'll be saving 90%+ this way.
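The percentages can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the post (26K, 900, and 2,000 tokens, and $0.9275 per session) and assumes, as a simplification, that the per-session overhead cost scales linearly with the per-turn token count. It ignores the extra tokens spent on `get_tool_schema` calls, so real savings would be somewhat lower.

```python
# Figures quoted in the post for the three-server example.
BASELINE_TOKENS = 26_000                      # per-turn overhead without the proxy
COMPRESSED_TOKENS = {"max": 900, "high": 2_000}
BASELINE_SESSION_COST = 0.9275                # USD per 50-turn session, per the post


def reduction(level: str) -> float:
    """Fraction of per-turn overhead tokens removed at a compression level."""
    return 1 - COMPRESSED_TOKENS[level] / BASELINE_TOKENS


def estimated_session_cost(level: str) -> float:
    """Overhead cost per session, assuming cost is proportional to tokens."""
    return BASELINE_SESSION_COST * COMPRESSED_TOKENS[level] / BASELINE_TOKENS


for level in COMPRESSED_TOKENS:
    print(
        f"{level}: {reduction(level):.1%} fewer tokens, "
        f"~${estimated_session_cost(level):.4f} overhead per session"
    )
# max: 96.5% fewer tokens, ~$0.0321 overhead per session
# high: 92.3% fewer tokens, ~$0.0713 overhead per session
```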