A command-line tool for querying CSV, JSON, Avro, and Parquet files using a pipe-based syntax

4 stars · Go

Dq – pipe-based CLI for querying CSV, JSON, Avro, and Parquet files

by razeghi71·Feb 21, 2026·3 points·0 comments

AI Analysis

●●● Banger · Solve My Problem · Big Brain

jq for tables: pipe-based querying across CSV, JSON, Avro, Parquet in one tool.

Strengths
  • Solves genuine data exploration friction by replacing multi-tool workflows (avro-tools, Python, DataFusion CLI).
  • Pipe-based composable syntax is genuinely better than SQL for quick terminal work — intentional design choice, not a limitation.
  • Supports multiple tabular formats with consistent operations API — actual multi-format story, not a wrapper.
Weaknesses
  • Limited aggregation and grouping compared to SQL or DataFusion; some power-user queries may require Python fallback.
  • No cross-file joins or window functions mentioned; scoped to single-file exploration rather than full query engine.
Target Audience

Data engineers, backend developers who work with data files from terminal

Similar To

jq (for JSON) · DataFusion CLI · DuckDB CLI

Post Description

I'm a data engineer, and exploring a data file from the terminal has always felt more painful to me than it should. My usual flow involved some combination of avro-tools, opening the file in Excel or Sheets, writing a quick Python script, using DataFusion CLI, or loading it into a database just to run one query. It works, but it's friction -- and it adds up when you're just trying to understand what's in a file or track down a bug in a pipeline.

A while ago I had this idea of a simple pipe-based CLI tool, like jq but for tabular data, that works across all these formats with a consistent syntax. I refined the idea over time into something I wanted to be genuinely simple and useful -- not a full query engine, just a sharp tool for exploration and debugging. I never got around to building it though. Last week, with AI tools actually being capable now, I finally did :)

I deliberately avoided SQL. For quick terminal work, the pipe-based composable style feels much more natural: you build up the query step by step, left to right, and each piece is obvious in isolation. SQL asks you to hold the whole structure in your head before you start typing.

`dq 'sales.parquet | filter { amount > 1000 } | group category | reduce total = sum(amount), n = count() | remove grouped | sortd total | head 10'`

How it works technically: dq has a hand-written lexer and recursive descent parser that turns the query string into an AST, which is then evaluated against the file lazily where possible. Each operator (filter, select, group, reduce, etc.) is a pure transformation -- it takes a table in and returns a table out. This is what makes the pipe model work cleanly: operators are fully orthogonal and composable in any order.
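To make the "table in, table out" idea concrete, here is a minimal sketch of operators as composable functions. It is not dq's actual code: the `Row`, `Table`, `Op`, `Filter`, `SortDesc`, `Head`, and `Pipe` names are hypothetical, rows are simplified to numeric columns, and evaluation is eager rather than lazy.

```go
package main

import (
	"fmt"
	"sort"
)

// Row and Table are hypothetical in-memory types for this sketch only;
// the post does not describe dq's internal representation.
type Row map[string]float64
type Table []Row

// Op is a pure transformation: it takes a table and returns a new table.
type Op func(Table) Table

// Filter keeps the rows for which pred returns true.
func Filter(pred func(Row) bool) Op {
	return func(in Table) Table {
		out := Table{}
		for _, r := range in {
			if pred(r) {
				out = append(out, r)
			}
		}
		return out
	}
}

// SortDesc returns a copy of the table sorted by col, largest first.
func SortDesc(col string) Op {
	return func(in Table) Table {
		out := append(Table{}, in...)
		sort.SliceStable(out, func(i, j int) bool { return out[i][col] > out[j][col] })
		return out
	}
}

// Head returns a copy of at most the first n rows.
func Head(n int) Op {
	return func(in Table) Table {
		k := n
		if k > len(in) {
			k = len(in)
		}
		return append(Table{}, in[:k]...)
	}
}

// Pipe composes operators left to right, like the `|` in a query string.
func Pipe(ops ...Op) Op {
	return func(t Table) Table {
		for _, op := range ops {
			t = op(t)
		}
		return t
	}
}

func main() {
	sales := Table{{"amount": 500}, {"amount": 3000}, {"amount": 1200}, {"amount": 4500}}
	query := Pipe(
		Filter(func(r Row) bool { return r["amount"] > 1000 }),
		SortDesc("amount"),
		Head(2),
	)
	for _, r := range query(sales) {
		fmt.Println(r["amount"]) // prints 4500, then 3000
	}
}
```

Because every operator shares the same signature and never mutates its input, any sequence of them type-checks and can be reordered freely, which is the property the pipe model relies on.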

It's written in Go -- single self-contained binary, 11MB, no runtime dependencies, installable via Homebrew. I'd love feedback, especially from anyone who's felt the same friction.
