Veles – Hybrid (BM25 and semantic) local code search MCP, in Rust
Pure Rust CPU-only code search with persistent index beats transformer-heavy alternatives.
95% fewer tokens than grep on PyTorch retrieval tasks: agent-first code search with published benchmarks.
AI coding agents and developers working in large codebases requiring semantic code navigation
Sourcegraph Cody · Continue.dev · OpenAI Code Interpreter
The goal is to reduce noisy retrieval loops and token waste in real repositories. cgrep combines BM25 + tree-sitter symbol awareness, with optional semantic/hybrid search, and returns deterministic JSON for agent workflows.
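For readers unfamiliar with the BM25 part, here is a minimal sketch of Okapi BM25 ranking over tokenized snippets, emitted as JSON with a stable order. It is not cgrep's implementation: the function name `bm25_scores`, the toy snippets, and the defaults `k1=1.2` and `b=0.75` are assumptions for illustration, and the tree-sitter symbol awareness and semantic search are left out entirely.

```python
import json
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized doc against a tokenized query with Okapi BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: in how many docs each term appears at least once.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy "code snippets", already split into tokens (hypothetical data).
docs = [
    "fn parse_config path".split(),
    "fn load_index path cache".split(),
    "struct Config fields".split(),
]
scores = bm25_scores(["parse_config", "path"], docs)

# sorted() is stable, so equal scores keep index order and output is repeatable.
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(json.dumps([{"doc": i, "score": round(scores[i], 4)} for i in ranking]))
```

The snippet matching both query terms ranks first, the one matching only `path` second, and the one matching nothing scores zero.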
What it does:
- Code navigation: definition, references, callers, dependents
- Focused context tools: read, map
- Agent flow: `agent locate` -> `agent expand` (small payload first, expand only selected IDs)
- MCP support: `cgrep mcp serve` + host install helpers
- Agent install support: claude-code, codex, copilot, cursor, opencode
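The "small payload first, expand only selected IDs" idea can be sketched as a two-stage lookup. This is only an illustration of the pattern, not cgrep's code or CLI: the in-memory `INDEX`, the `locate` and `expand` functions, and the JSON field names are all hypothetical.

```python
import json

# Hypothetical in-memory index: ID -> (path, symbol, full source text).
INDEX = {
    "s1": ("src/tensor.py", "add", "def add(a, b):\n    return a + b\n"),
    "s2": ("src/tensor.py", "add_", "def add_(a, b):\n    a += b\n    return a\n"),
    "s3": ("src/ops.py", "mul", "def mul(a, b):\n    return a * b\n"),
}


def locate(query):
    """Stage 1: return compact hits (ID, path, symbol) without source bodies."""
    hits = [
        {"id": sid, "path": path, "symbol": symbol}
        for sid, (path, symbol, _body) in sorted(INDEX.items())
        if query in symbol
    ]
    return json.dumps(hits, sort_keys=True)


def expand(ids):
    """Stage 2: return full bodies only for the IDs the agent selected."""
    out = [{"id": sid, "body": INDEX[sid][2]} for sid in ids]
    return json.dumps(out, sort_keys=True)


small = locate("add")                    # cheap: IDs and locations only
chosen = [json.loads(small)[0]["id"]]    # the agent picks what it needs
large = expand(chosen)                   # source text only for that pick
print(small)
print(large)
```

The token saving comes from the first call never carrying source text, so the agent pays for bodies only on the hits it actually selects.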
Benchmark snapshot (PyTorch, 6 implementation-tracing scenarios):
- Baseline (`grep`) tokens-to-complete: 127,665
- cgrep (`agent locate/expand`) tokens-to-complete: 6,153
- 95.2% fewer tokens (20.75x smaller)
- Avg retrieval latency to completion: 1321.3ms -> 22.7ms (~58.2x faster after indexing)
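The derived figures follow directly from the raw numbers above. A quick check, with variable names chosen here for illustration:

```python
baseline_tokens = 127_665   # grep, tokens-to-complete
cgrep_tokens = 6_153        # cgrep agent locate/expand, tokens-to-complete
baseline_ms = 1321.3        # grep, avg retrieval latency to completion
cgrep_ms = 22.7             # cgrep, avg retrieval latency (after indexing)

reduction_pct = (1 - cgrep_tokens / baseline_tokens) * 100
size_ratio = baseline_tokens / cgrep_tokens
speedup = baseline_ms / cgrep_ms

print(f"{reduction_pct:.1f}% fewer tokens, {size_ratio:.2f}x smaller, {speedup:.1f}x faster")
# prints: 95.2% fewer tokens, 20.75x smaller, 58.2x faster
```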
Links:
- Repo: https://github.com/meghendra6/cgrep
- Docs: https://meghendra6.github.io/cgrep/
- Benchmark method/results: https://meghendra6.github.io/cgrep/benchmarks/pytorch-agent-...
I’d really appreciate feedback on:
- Real-world agent workflows I should benchmark next
- MCP/agent integrations I should add
- Cases where cgrep retrieval quality still falls short
Tree-sitter dependency graph saves 5,000-20,000 tokens per agent query vs exploration.
Tree-sitter + FTS5 + MCP = tokens saved for AI agents to actually code, not search.
Instruction tuning on tool descriptions cut Sonnet costs 29% without code changes.
Tree-sitter + SQLite graph reduces agent context 74% while staying entirely local.
Tree-sitter + Rhai scripts replace opinionated formatters, but beta stability and language coverage remain questions.