cgrep – local, code-aware search for AI coding agents

Author: meghendra

by meghendra · Feb 14, 2026 · 2 points · 0 comments


AI Analysis

●●● Banger · Solve My Problem · Wizardry · Big Brain

Agent-first code search that reports 95% fewer tokens than grep on PyTorch retrieval tasks, backed by a published benchmark.

Strengths

•Quantified agent efficiency win: 20.75x fewer tokens and 58.2x lower retrieval latency (after indexing) across 6 PyTorch implementation-tracing scenarios
•BM25 combined with tree-sitter symbol awareness avoids the overhead of semantic-only search, which stays optional
•MCP integration and agent install helpers lower the adoption barrier across Claude Code, Codex, Copilot, Cursor, and opencode

Weaknesses

•Narrow audience: primarily valuable for AI agent loops, less obvious utility for human workflows
•No data on adoption or whether agents actually use it at scale in production

Post Description

Hi HN — I built cgrep, a local-first, code-aware search tool for AI coding agents (and humans).

The goal is to reduce noisy retrieval loops and token waste in real repositories. cgrep combines BM25 + tree-sitter symbol awareness, with optional semantic/hybrid search, and returns deterministic JSON for agent workflows.
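For readers unfamiliar with BM25, the ranking function can be sketched in a few lines. This is a generic illustration, not cgrep's implementation: the function name `bm25_scores`, the whitespace tokenization, the toy documents, and the parameter defaults `k1=1.2` and `b=0.75` are all assumptions made for the example.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`."""
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Toy corpus (hypothetical): each "document" is a tokenized code snippet.
docs = [
    "def conv2d forward input weight".split(),
    "class Linear module forward".split(),
    "def relu input".split(),
]
scores = bm25_scores("conv2d forward".split(), docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(best, scores)
```

The first snippet matches both query terms and ranks highest, and the third matches neither and scores zero. A symbol-aware layer could then boost hits that are definitions rather than plain text matches, though how cgrep weights the two signals is not described in the post.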

What it does:
- Code navigation: definition, references, callers, dependents
- Focused context tools: read, map
- Agent flow: `agent locate` -> `agent expand` (small payload first, expand only selected IDs)
- MCP support: `cgrep mcp serve` + host install helpers
- Agent install support: claude-code, codex, copilot, cursor, opencode
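The "small payload first, expand only selected IDs" idea can be sketched as a two-stage lookup. Everything in this example is hypothetical: the `INDEX` contents, the `locate` and `expand` function names, and the JSON fields are invented to show the pattern and do not reflect cgrep's actual commands, flags, or output schema.

```python
import json

# Hypothetical index: IDs, paths, symbols, and bodies are invented for illustration.
INDEX = {
    "s1": {"path": "nn/conv.py", "symbol": "conv2d",
           "body": "def conv2d(x, w): ..."},
    "s2": {"path": "nn/conv.py", "symbol": "conv_transpose2d",
           "body": "def conv_transpose2d(x, w): ..."},
    "s3": {"path": "nn/linear.py", "symbol": "linear",
           "body": "def linear(x, w, b): ..."},
}

def locate(query):
    """Stage 1: return small handles (ID, path, symbol) with no code bodies."""
    return [
        {"id": sid, "path": entry["path"], "symbol": entry["symbol"]}
        for sid, entry in sorted(INDEX.items())  # sorted for stable ordering
        if query in entry["symbol"]
    ]

def expand(ids):
    """Stage 2: return full bodies only for the IDs the caller selected."""
    return [{"id": sid, "body": INDEX[sid]["body"]} for sid in ids if sid in INDEX]

hits = locate("conv")            # cheap first pass
selected = [hits[0]["id"]]       # the agent picks what it actually needs
payload = expand(selected)       # pay for full context only on those IDs

# sort_keys keeps the serialized output byte-for-byte stable across runs.
print(json.dumps(hits, sort_keys=True))
print(json.dumps(payload, sort_keys=True))
```

The token saving in a flow like this comes from the first stage carrying no code bodies, so an agent that needs one of many matches pays for only that one.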

Benchmark snapshot (PyTorch, 6 implementation-tracing scenarios):
- Baseline (`grep`) tokens-to-complete: 127,665
- cgrep (`agent locate/expand`) tokens-to-complete: 6,153
- 95.2% fewer tokens (20.75x smaller)
- Avg retrieval latency to completion: 1321.3 ms -> 22.7 ms (~58.2x faster after indexing)
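The derived figures follow directly from the raw numbers in the snapshot. This short check recomputes them; the variable names are chosen here for illustration.

```python
# Raw numbers quoted in the benchmark snapshot.
baseline_tokens = 127_665
cgrep_tokens = 6_153
baseline_ms = 1321.3
cgrep_ms = 22.7

token_ratio = baseline_tokens / cgrep_tokens
token_reduction_pct = 100 * (1 - cgrep_tokens / baseline_tokens)
latency_ratio = baseline_ms / cgrep_ms

print(f"{token_ratio:.2f}x smaller")               # 20.75x smaller
print(f"{token_reduction_pct:.1f}% fewer tokens")  # 95.2% fewer tokens
print(f"{latency_ratio:.1f}x faster")              # 58.2x faster
```

The latency ratio compares retrieval time only and, as the snapshot notes, applies after indexing, so one-time index build cost is not part of it.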

Links:
- Repo: https://github.com/meghendra6/cgrep
- Docs: https://meghendra6.github.io/cgrep/
- Benchmark method/results: https://meghendra6.github.io/cgrep/benchmarks/pytorch-agent-...

I’d really appreciate feedback on:
- Real-world agent workflows I should benchmark next
- MCP/agent integrations I should add
- Cases where cgrep retrieval quality still falls short