Warp_cache – SIEVE cache in Rust for Python, 25x faster than cachetools
SIEVE cache beats LRU with one-line swap, but only matters if you're bottlenecked on cache.
Wrap Python functions and shell commands as content-addressed transformations. Cache results, run them locally or on a cluster, and share them by checksum.
Checksum-based computation identity beats Make and DVC for reproducible pipelines.
Data scientists and research engineers building reproducible pipelines
Nix · DVC · Make
It started as a hobby project — I had an itch about programming not being at-your-fingertips enough. Then I applied it to my work as a bioinformatics research engineer. The early versions focused on interactive workflows. After a year or two I realized that to make interactivity work properly, you need really good DAG tracking, so checksums were added everywhere. My lab built a collaborative web server with it that we published. More recently I've rebuilt it around the command line, persistent caching, and remote deployment.
It's still in alpha, but the core is usable.
Core idea: same code + same inputs = same result, identified by checksum. If you've already computed it, you don't compute it again.
Two entry points:
Python:
from seamless.transformer import direct
@direct def add(a, b): import time time.sleep(5) return a + b
add(2, 3) # runs, caches result add(2, 3) # cache hit — instant
Bash:seamless-run 'seq 1 10 | tac && sleep 5' # runs, caches result seamless-run 'seq 1 10 | tac && sleep 5' # cache hit — instant
With persistent caching enabled, results are stored as checksum-to-checksum mappings in a small SQLite database that can be shared with collaborators, so that they get cache hits too.Execution scales by changing config, not code: in-process, spawned workers, or a Dask-backed HPC cluster.
Remote execution also doubles as a reproducibility test. If your code produces the same result on a clean worker, it's reproducible. If not, Seamless helped you find the problem — whether it's a missing dependency, an undeclared input, or a platform sensitivity.
Built for scientific computing and data pipelines, but works for anything pipeline-shaped.
SIEVE cache beats LRU with one-line swap, but only matters if you're bottlenecked on cache.
Caches bloated MCP responses and lets agents query with jq, saving real tokens.
Scheduled AI agents with per-run cost tracking beat generic chat wrappers.
Accessibility tree beats screenshot tokens, per-step model control is genuinely clever.
Content-hash caching beats git-diff approaches for incremental i18n translation.
Standalone KV cache compression script implementing TurboQuant with 1.55x ratio.