See – searchable JSON compression (offline 10-min demo)

by Tetsuro·Feb 21, 2026·5 points·0 comments

AI Analysis

●● Solid · Big Brain · Wizardry

Schema-aware JSON compression that stays searchable; reports a compressed size of 7.7% of the original vs. 13.7% for Zstd.

Strengths
  • Novel compression architecture combining structure, delta, dictionary, Bloom filters, and skip indexes
  • Rigorous benchmarking with latency percentiles (p50/p95/p99) and skip-rate metrics
  • Proof-first evaluation model reduces vaporware risk: a demo ZIP and an audit pack are published
Weaknesses
  • Positioned for acquisition or enterprise use only; open-source licensing and the commercial path are unclear
  • Niche use case: only valuable if you store massive volumes of JSON and can integrate custom decompression logic
Target Audience

Data engineers, observability teams, storage optimization specialists

Similar To

Zstandard (Zstd) · Apache Arrow · Protocol Buffers

Post Description

Hi HN, I’m building SEE (Semantic Entropy Encoding): a searchable compression format for JSON/NDJSON. Goal: reduce the “data tax” (storage/egress) and “CPU tax” (decompress/parse) by keeping JSON searchable while compressed, with page-level random access.
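
To make "searchable while compressed, with page-level random access" concrete, here is a minimal sketch of the general technique. It is not SEE's actual format: the page size, the Bloom filter parameters, the `key=value` token scheme, and every function name are assumptions made for illustration, and `zlib` stands in for whatever codec SEE really uses. Each page of records is compressed independently and carries a small Bloom filter, so a lookup can skip pages that cannot contain the value without decompressing them.

```python
import hashlib
import json
import zlib

PAGE_SIZE = 4        # records per page (tiny, for illustration only)
BLOOM_BITS = 1024    # bits in each page's Bloom filter (assumed value)
BLOOM_HASHES = 3     # bit positions probed per token (assumed value)


def bloom_positions(token: str) -> list[int]:
    """Derive BLOOM_HASHES bit positions for a token from one SHA-256 digest."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return [
        int.from_bytes(digest[4 * i:4 * i + 4], "big") % BLOOM_BITS
        for i in range(BLOOM_HASHES)
    ]


def build_pages(records: list[dict]) -> list[tuple[int, bytes]]:
    """Split flat records into pages; return (bloom_bits, compressed_ndjson) per page."""
    pages = []
    for start in range(0, len(records), PAGE_SIZE):
        chunk = records[start:start + PAGE_SIZE]
        bloom = 0
        for record in chunk:
            for key, value in record.items():
                # Index every "key=value" pair of the page in its filter.
                for pos in bloom_positions(f"{key}={value}"):
                    bloom |= 1 << pos
        ndjson = "\n".join(json.dumps(r, sort_keys=True) for r in chunk)
        pages.append((bloom, zlib.compress(ndjson.encode("utf-8"))))
    return pages


def lookup(pages, key: str, value) -> tuple[list[dict], int]:
    """Return matching records and how many pages were skipped undecompressed."""
    probe = 0
    for pos in bloom_positions(f"{key}={value}"):
        probe |= 1 << pos
    hits, skipped = [], 0
    for bloom, blob in pages:
        if bloom & probe != probe:
            # At least one probed bit is unset: the page cannot hold the value.
            skipped += 1
            continue
        # Possible match (or a false positive): decompress only this page.
        for line in zlib.decompress(blob).decode("utf-8").splitlines():
            record = json.loads(line)
            if record.get(key) == value:
                hits.append(record)
    return hits, skipped


if __name__ == "__main__":
    records = [{"user_id": i, "status": "ok"} for i in range(16)]
    pages = build_pages(records)
    hits, skipped = lookup(pages, "user_id", 13)
    print(hits, f"skipped {skipped} of {len(pages)} pages")
```

A Bloom filter never produces false negatives, so a skipped page is guaranteed not to contain the value; a false positive only costs one unnecessary page decompression. The trade-off against a single compressed stream is that smaller, independent pages usually compress somewhat worse.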

I just published a proof-first evaluation release:

  • Offline DEMO ZIP (~10 min): prints compression ratios, skip rates, and lookup latency (p50/p95/p99)
  • DD pack: audit/repro evidence (decode mismatch=0, extended mismatch=0, audit PASS)
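
For readers who want to compare the demo's numbers against their own baseline, the sketch below shows one common way to compute a skip rate and p50/p95/p99 lookup latency. How the SEE demo actually measures these is not described in the post, so the function names, the definition of skip rate as "pages skipped / pages total", and the choice of the inclusive percentile method are all assumptions.

```python
import statistics
import time


def latency_percentiles(samples_ms: list[float]) -> dict[str, float]:
    """Return p50/p95/p99 from latency samples using inclusive cut points."""
    # quantiles(n=100) returns 99 cut points; index k-1 is the k-th percentile.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


def skip_rate(pages_skipped: int, pages_total: int) -> float:
    """Fraction of pages a lookup avoided decompressing (assumed definition)."""
    return pages_skipped / pages_total if pages_total else 0.0


def time_lookups(lookup_fn, keys) -> list[float]:
    """Time each lookup in milliseconds with a monotonic high-resolution clock."""
    samples = []
    for key in keys:
        start = time.perf_counter()
        lookup_fn(key)
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples
```

Tail percentiles such as p99 need many samples to be stable, so it is worth running at least a few thousand lookups and using the same key distribution for SEE and for the baseline being compared.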

Latest release: https://gitlab.com/kodomonocch1/see_proto/-/releases

Direct DEMO ZIP: https://gitlab.com/api/v4/projects/79686944/packages/generic...

OnePager is included in the release assets.

I’d love feedback on:

  • what workloads you’d try this on, and
  • what integration path would make this compelling vs. Zstd + external indexing.
