Brf.it – Extracting code interfaces for LLM context

Author: jeff-lee

by jeff-lee·Mar 7, 2026·1 point·0 comments


AI Analysis

●● Solid · Solve My Problem · Big Brain

Tree-sitter interface extraction cuts token usage by 6x, but chat context window optimization is becoming table stakes.

Strengths

•Tree-sitter AST parsing avoids regex fragility; accurate across Go, TypeScript, Python, and the other languages the post lists (12 so far).
•Dual XML/Markdown output with automatic token counting lets you measure and optimize LLM context efficiently.
•Gitignore-aware filtering and cross-platform CLI (Homebrew, curl, PowerShell) ship ready for immediate use.

Weaknesses

•Chat context optimization is increasingly table stakes; Cursor, Continue, and Codebase Copilot already handle this natively.
•No integration with actual LLM APIs or IDEs—you still manually copy-paste the output into your chat.

Post Description

I've been experimenting with ways to make AI coding assistants more efficient when working with large codebases.

The problem

When we give repository context to LLMs, we often send full files and implementations. But for many tasks (like understanding architecture or navigating a repo), the model doesn't actually need most of that.

This leads to two issues:
- unnecessary token usage
- noisy context

The idea

Instead of sharing the full implementation, what if we only shared the interface surface of the code?

Function signatures, types, imports, and documentation — basically the structure of the system rather than the implementation details.
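To make "interface surface" concrete, here is a minimal sketch in Python. It does not use Tree-sitter and is not how Brf.it works internally; it swaps in Python's standard `ast` module to pull signatures and docstrings out of Python source only. The names `extract_interface`, `SAMPLE`, and `fetch_user` are made up for illustration.

```python
import ast


def extract_interface(source: str) -> list[dict]:
    """Return the signature and docstring of every function in `source`."""
    tree = ast.parse(source)
    entries = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # ast.unparse (Python 3.9+) turns the argument node back into text.
            args = ast.unparse(node.args)
            returns = f" -> {ast.unparse(node.returns)}" if node.returns else ""
            prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            entries.append({
                "signature": f"{prefix} {node.name}({args}){returns}",
                "doc": ast.get_docstring(node) or "",
            })
    return entries


# Hypothetical input: the body is dropped, only the surface is kept.
SAMPLE = '''
def fetch_user(user_id: str) -> dict:
    """Fetch a user record; raises KeyError if missing."""
    users = {"1": {"name": "Ada"}}
    return users[user_id]
'''

entries = extract_interface(SAMPLE)
for entry in entries:
    print(entry["signature"])
    print("   ", entry["doc"])
```

The function body never reaches the output, which is the whole point: the model sees what can be called and what it promises, not how it is done.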

The experiment

I built a small CLI tool called Brf.it to test this idea. It uses Tree-sitter to parse code and extract structural information.

Example output:

<file path="src/api.ts">
  <function>fetchUser(id: string): Promise<User></function>
  <doc>Fetches user from API, throws on 404</doc>
</file>
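The shape of that output can be sketched in a few lines of Python. This is an illustration of the format as shown in the post, not Brf.it's actual renderer; `render_file` and its parameters are hypothetical names.

```python
def render_file(path: str, items: list[tuple[str, str]]) -> str:
    """Render (signature, doc) pairs in the tag layout shown in the post."""
    lines = [f'<file path="{path}">']
    for signature, doc in items:
        lines.append(f"  <function>{signature}</function>")
        if doc:  # skip the <doc> line when there is no documentation
            lines.append(f"  <doc>{doc}</doc>")
    lines.append("</file>")
    return "\n".join(lines)


output = render_file(
    "src/api.ts",
    [("fetchUser(id: string): Promise<User>", "Fetches user from API, throws on 404")],
)
print(output)
```

Note that the sample in the post leaves `Promise<User>` unescaped, so the result reads as XML-like tagging for a model rather than strict XML. If the output had to go through an XML parser, the signature and doc text would need escaping first, for example with `xml.sax.saxutils.escape`.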

In one example from a repo, a function of roughly 50 tokens compresses to about 8 tokens (around 6x smaller) when reduced to just its signature and documentation.
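A rough way to check this kind of ratio on your own code is sketched below. It is an assumption-heavy estimate: `rough_token_count` counts words and punctuation marks with a regex, which is not how any real LLM tokenizer works, and the `FULL` function body (including the `/api/users/` URL) is invented for the example. Only the 50 and 8 figures come from the post.

```python
import re


def rough_token_count(text: str) -> int:
    """Crude estimate: each word and each punctuation mark counts as one token."""
    return len(re.findall(r"\w+|[^\w\s]", text))


# Hypothetical full implementation (not taken from any real repo).
FULL = """
async function fetchUser(id: string): Promise<User> {
  const res = await fetch(`/api/users/${id}`);
  if (res.status === 404) {
    throw new Error("user not found");
  }
  return res.json();
}
"""

# The same function reduced to signature plus documentation.
BRIEF = "fetchUser(id: string): Promise<User>\nFetches user from API, throws on 404"

full_tokens = rough_token_count(FULL)
brief_tokens = rough_token_count(BRIEF)
ratio = full_tokens / brief_tokens
print(f"full={full_tokens} brief={brief_tokens} ratio={ratio:.1f}x")

# The figures quoted in the post: roughly 50 tokens down to about 8.
post_ratio = 50 / 8
print(f"post example: {post_ratio:.2f}x")
```

For real measurements, swap the regex for the tokenizer of the model you actually use; the counts will differ, but the comparison works the same way.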

The goal isn't to replace sharing full code, but to provide a lightweight context layer for things like:
- architecture understanding
- repo navigation
- initial prompt context for AI agents

Inspired partly by repomix, but with a different approach: instead of compressing the full repo, it extracts the API-level structure.

Language support so far: Go, TypeScript, JavaScript, Python, Rust, C, C++, Java, Swift, Kotlin, C#, Lua

Project: https://indigo-net.github.io/Brf.it/

Curious if others have tried similar approaches.

What information do you think is actually essential for LLM code understanding? Are function signatures + docs enough for architecture reasoning? Are there formats that work better for LLM consumption?