GitHub Repository

Wrap Python functions and shell commands as content-addressed transformations. Cache results, run them locally or on a cluster, and share them by checksum.

24 stars · Shell

Seamless – Content-addressed computation caching for Python and bash

Author: sjdv1982

by sjdv1982 · Apr 17, 2026 · 1 point · 0 comments


AI Analysis

●●● Banger · Big Brain · Wizardry

Checksum-based computation identity beats Make and DVC for reproducible pipelines.

Strengths

• Content-addressed identity means identical computations auto-deduplicate across machines
• SQLite database makes sharing cached results as simple as copying a file
• Wraps both Python functions and shell commands without code changes

Weaknesses

• Alpha stage: interactive features from 0.x are still being ported to the new architecture
• 92 open issues suggest rough edges remain for production use

Post Description

Hey HN, Sjoerd de Vries here. I have worked on Seamless for nearly 10 years now. It has been used in my lab, but I was always around for troubleshooting. This is the first time that I think it's ready to stand on its own. I would love to hear your thoughts about it.

It started as a hobby project — I had an itch about programming not being at-your-fingertips enough. Then I applied it to my work as a bioinformatics research engineer. The early versions focused on interactive workflows. After a year or two I realized that to make interactivity work properly, you need really good DAG tracking, so checksums were added everywhere. My lab built a collaborative web server with it that we published. More recently I've rebuilt it around the command line, persistent caching, and remote deployment.

It's still in alpha, but the core is usable.

Core idea: same code + same inputs = same result, identified by checksum. If you've already computed it, you don't compute it again.
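The idea can be illustrated with a minimal sketch. This is not Seamless's implementation; `TransformationCache`, `checksum`, and the JSON-based hashing scheme are hypothetical names and choices made for illustration only. The sketch hashes the function's source text together with its inputs and runs the code only when that checksum has not been seen before.

```python
import hashlib
import json


def checksum(obj) -> str:
    """SHA-256 over a canonical JSON encoding of obj."""
    encoded = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class TransformationCache:
    """In-memory map from transformation checksum to result (hypothetical helper)."""

    def __init__(self):
        self.results = {}
        self.executions = 0  # how many times code was actually run

    def run(self, source: str, func_name: str, **inputs):
        # The identity of a computation is the checksum of its code plus its inputs.
        key = checksum({"code": source, "function": func_name, "inputs": inputs})
        if key not in self.results:
            namespace = {}
            exec(source, namespace)  # define the function from its source text
            self.results[key] = namespace[func_name](**inputs)
            self.executions += 1
        return self.results[key]


ADD_SOURCE = "def add(a, b):\n    return a + b\n"

cache = TransformationCache()
first = cache.run(ADD_SOURCE, "add", a=2, b=3)   # executes the code
second = cache.run(ADD_SOURCE, "add", a=2, b=3)  # same checksum: cache hit
third = cache.run(ADD_SOURCE, "add", a=2, b=4)   # different input: executes again
```

Because the key depends only on content, any machine that hashes the same source and inputs arrives at the same identity, which is what makes sharing results by checksum possible.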

Two entry points:

Python:

    from seamless.transformer import direct

    @direct
    def add(a, b):
        import time
        time.sleep(5)
        return a + b

    add(2, 3)  # runs, caches result
    add(2, 3)  # cache hit, instant

Bash:

    seamless-run 'seq 1 10 | tac && sleep 5'  # runs, caches result
    seamless-run 'seq 1 10 | tac && sleep 5'  # cache hit, instant

With persistent caching enabled, results are stored as checksum-to-checksum mappings in a small SQLite database that can be shared with collaborators, so that they get cache hits too.
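As a rough sketch of what a checksum-to-checksum store could look like, the following uses Python's standard `sqlite3` module. The table name, column names, and helper functions are assumptions for illustration and do not reflect Seamless's actual schema. Since only checksums are stored, turning a result checksum back into a value would need a separate store, which this sketch omits.

```python
import hashlib
import json
import os
import shutil
import sqlite3
import tempfile


def checksum(obj) -> str:
    """SHA-256 over a canonical JSON encoding of obj."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode("utf-8")).hexdigest()


def open_cache(path):
    """Open (or create) a cache database with a hypothetical one-table schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS transformations ("
        "transformation_checksum TEXT PRIMARY KEY, "
        "result_checksum TEXT NOT NULL)"
    )
    return conn


def record(conn, transformation, result):
    """Store the mapping transformation checksum -> result checksum."""
    conn.execute(
        "INSERT OR REPLACE INTO transformations VALUES (?, ?)",
        (checksum(transformation), checksum(result)),
    )
    conn.commit()


def lookup(conn, transformation):
    """Return the cached result checksum, or None on a cache miss."""
    row = conn.execute(
        "SELECT result_checksum FROM transformations WHERE transformation_checksum = ?",
        (checksum(transformation),),
    ).fetchone()
    return row[0] if row else None


workdir = tempfile.mkdtemp()
mine = os.path.join(workdir, "mine.db")
theirs = os.path.join(workdir, "theirs.db")

task = {"code": "def add(a, b): return a + b", "inputs": {"a": 2, "b": 3}}

conn = open_cache(mine)
record(conn, task, 5)
conn.close()

# Sharing the cache amounts to copying the database file.
shutil.copy(mine, theirs)
collaborator = open_cache(theirs)
hit = lookup(collaborator, task)
miss = lookup(collaborator, {"code": "something else", "inputs": {}})
collaborator.close()
shutil.rmtree(workdir)
```

The collaborator gets a hit for the recorded task without ever running it, and a miss for anything that hashes differently.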

Execution scales by changing config, not code: in-process, spawned workers, or a Dask-backed HPC cluster.

Remote execution also doubles as a reproducibility test. If your code produces the same result on a clean worker, it's reproducible. If not, Seamless helped you find the problem — whether it's a missing dependency, an undeclared input, or a platform sensitivity.
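The reproducibility check can be approximated locally, under the assumption that a fresh, isolated Python interpreter stands in for a clean remote worker. Everything below (`WORKER`, `run_here`, `run_clean`, `is_reproducible`) is hypothetical and only illustrates the comparison of result checksums; it is not how Seamless dispatches work.

```python
import hashlib
import json
import subprocess
import sys

# Script executed by the clean worker: read a job from stdin, run it,
# and print the checksum of the result.
WORKER = """
import hashlib, json, sys
job = json.loads(sys.stdin.read())
namespace = {}
exec(job["code"], namespace)
result = namespace[job["function"]](**job["inputs"])
print(hashlib.sha256(json.dumps(result, sort_keys=True).encode("utf-8")).hexdigest())
"""


def result_checksum(result) -> str:
    return hashlib.sha256(json.dumps(result, sort_keys=True).encode("utf-8")).hexdigest()


def run_here(code, function, inputs, ambient=None):
    """Run in this process; `ambient` models state that happens to exist locally."""
    namespace = dict(ambient or {})
    exec(code, namespace)
    return result_checksum(namespace[function](**inputs))


def run_clean(code, function, inputs):
    """Run in a fresh interpreter in isolated mode (-I); None if the job fails."""
    job = json.dumps({"code": code, "function": function, "inputs": inputs})
    proc = subprocess.run(
        [sys.executable, "-I", "-c", WORKER],
        input=job, capture_output=True, text=True,
    )
    return proc.stdout.strip() if proc.returncode == 0 else None


def is_reproducible(code, function, inputs, ambient=None):
    return run_here(code, function, inputs, ambient) == run_clean(code, function, inputs)


declared = "def scale(x, factor):\n    return x * factor\n"
undeclared = "def scale(x):\n    return x * FACTOR\n"  # FACTOR is an undeclared input

ok = is_reproducible(declared, "scale", {"x": 3, "factor": 2})
broken = is_reproducible(undeclared, "scale", {"x": 3}, ambient={"FACTOR": 2})
```

The second function works locally only because `FACTOR` happens to be defined in the session; the clean worker has no such variable, so the checksums cannot match and the undeclared input is exposed.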

Built for scientific computing and data pipelines, but works for anything pipeline-shaped.

Similar Projects

Developer Tools · ●● Solid

Seamless: content-addressed computation and caching

Content-addressed caching for Python and shell with checksum-based result sharing.

Big Brain · Rabbit Hole

sjdv1982

201mo ago

AI/ML · ●●● Banger

Heddle, content-addressed contracts for spec-driven agent loops

Content-addressed contracts cut agent context from thousands to hundreds of tokens.

Big Brain · Wizardry

davet47

3026d ago

AI/ML · ●● Solid

Self tuning chat exposing its semantic and agentic cache

Tool result caching for agents when GPTCache and LangChain already do semantic caching.

Ship It · Niche Gem

kivanowbetterdb

401mo ago

Developer Tools · ●●● Banger

Warp_cache – SIEVE cache in Rust for Python, 25x faster than cachetools

SIEVE cache beats LRU with one-line swap, but only matters if you're bottlenecked on cache.

Wizardry · Big Brain

tolopalmer

204mo ago

Developer Tools · ●● Solid

Llmbuffer – Python library for cache-optimized LLM conversation history

Byte-stable prefix organization beats naive message concatenation for cache hits.

Big Brain · Ship It

scottmp10

101mo ago

AI/ML · ●● Solid

Structured Python control over AI computer use agents

Accessibility tree beats screenshot tokens, per-step model control is genuinely clever.

Big Brain · Wizardry

aadyachinubhai

143mo ago