GitHub Repository
3 stars · Python

UMC – Lossless compression that beats lzma by 7-46% on numeric data

by gunnerlevi·Mar 2, 2026·3 points·0 comments

AI Analysis

●●● Banger · Wizardry · Big Brain · Solve My Problem

Searches compressed data without decompressing it, and compresses numeric arrays smaller than lzma does.

Strengths
  • Genuine two-in-one value: compression *and* a searchable index together remove the decompress-to-search bottleneck.
  • Multiple modes (lossless, near-lossless, quantized) with clear tradeoffs and provable optimality option.
  • Lightweight install with numpy as the only required dependency; full neural search is optional, which keeps the default build simple.
Weaknesses
  • Specialized to numeric/structured data; general-purpose compressors still handle arbitrary binaries better.
  • Limited ecosystem evidence; no public benchmarks against specialized time-series databases (InfluxDB, Timescale).
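
Since the headline claim is a size comparison against lzma, a reader may want a baseline to measure against. The sketch below is not UMC's method or API, which this listing does not describe. It is an illustrative assumption showing one classic numeric-aware trick (delta encoding) applied before the standard-library `lzma` module, on a synthetic integer series. The helper names and the test data are hypothetical.

```python
import lzma
import random
import struct


def pack_int64(values):
    """Serialize integers as little-endian signed 64-bit values."""
    return struct.pack(f"<{len(values)}q", *values)


def unpack_int64(blob):
    """Inverse of pack_int64."""
    return list(struct.unpack(f"<{len(blob) // 8}q", blob))


def delta_encode(values):
    """Keep the first value, then store each difference from its predecessor."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the original series with a running sum."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


# Synthetic, slowly increasing series (an assumption standing in for sensor or
# timestamp data); the fixed seed keeps the run reproducible.
rng = random.Random(0)
series, level = [], 1_000_000
for _ in range(10_000):
    level += rng.randint(0, 9)
    series.append(level)

# Baseline: lzma on the raw 64-bit values.
raw_size = len(lzma.compress(pack_int64(series)))

# Numeric-aware variant: delta-encode first, then lzma.
delta_blob = lzma.compress(pack_int64(delta_encode(series)))
delta_size = len(delta_blob)

# The transform is lossless: decoding returns the exact input.
restored = delta_decode(unpack_int64(lzma.decompress(delta_blob)))
assert restored == series

saving = 100 * (raw_size - delta_size) / raw_size
print(f"lzma raw: {raw_size} B, delta+lzma: {delta_size} B, saving: {saving:.1f}%")
```

The same harness could time or size any other compressor on identical input. The savings figure it prints reflects this synthetic series only and says nothing about UMC's reported 7-46% range.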
Target Audience

Data engineers, researchers, and anyone managing large numeric datasets (time series, sensor data, images).

Similar To

zstd · lzma · ClickHouse (columnar + compression)
