Deploy a DuckLake data lakehouse on Hetzner for under €10/mo
One-command DuckLake deployment on Hetzner, but orchestration of existing tools.
Access control for DuckLake lakehouses
Postgres RLS plus S3 bucket policies in one CLI for DuckLake auth.
Data engineers running DuckLake on Hetzner or similar object storage
LakeFS · Apache Ranger · Immuta
I wanted a cost-effective lakehouse on Hetzner that we can own in the EU. I wrote another repo (ducklake-hetzner) for a deployment under €15/month, but there's still a long way to go before its functionality comes close to that of other data warehouses.
Hetzner's Object Storage is also not the easiest to work with: it runs Ceph but doesn't expose IAM, which means any user has full access by default. The workaround is to create a separate dummy project, store the S3 credentials there, and then attach an "Allow" policy to those credentials (this works because they're denied by default).
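To make the "Allow" policy idea concrete, here is a minimal sketch of what such a bucket policy could look like, built as a plain Python dict. The bucket name, prefix, and principal ARN are placeholders I made up for illustration. In particular, the exact principal format Hetzner's Ceph gateway expects is an assumption, as is support for the `s3:prefix` condition key, so check both against your own setup.

```python
import json


def build_allow_policy(bucket: str, principal_arn: str, prefix: str,
                       read_only: bool = True) -> dict:
    """Build an S3-style bucket policy that allows one principal on one prefix.

    All names passed in are hypothetical; nothing here is specific to dga.
    """
    object_actions = ["s3:GetObject"]
    if not read_only:
        object_actions += ["s3:PutObject", "s3:DeleteObject"]

    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level access, limited to keys under the prefix.
                "Sid": "AllowObjectAccess",
                "Effect": "Allow",
                "Principal": {"AWS": [principal_arn]},
                "Action": object_actions,
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}/*"],
            },
            {
                # Listing is a bucket-level action, so it is scoped with a
                # condition on the requested prefix instead of the resource.
                "Sid": "AllowListPrefix",
                "Effect": "Allow",
                "Principal": {"AWS": [principal_arn]},
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
        ],
    }


# Placeholder values: replace with your own bucket, principal, and prefix.
policy = build_allow_policy(
    bucket="my-lake",
    principal_arn="arn:aws:iam:::user/EXAMPLE_PRINCIPAL",
    prefix="main/customers",
)
print(json.dumps(policy, indent=2))
```

The resulting JSON string could then be applied with any S3-compatible client, for example boto3's `put_bucket_policy(Bucket=..., Policy=...)`.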
To help others, I figured I'd package that into a single CLI:
dga allow alice --table customers --read-only
It does two things: it sets up PostgreSQL Row-Level Security on the DuckLake catalog, and it applies scoped S3 bucket policies on the storage layer. It's still alpha, but the core superuser/writer/reader pattern works.
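For the catalog half, a rough sketch of the kind of SQL a read-only grant might translate to is shown below. This is not how dga actually does it. The catalog table names and the `table_id` filter column are assumptions about the DuckLake metadata schema, and the helper `rls_statements` is hypothetical. Only the `ALTER TABLE ... ENABLE ROW LEVEL SECURITY` and `CREATE POLICY` statements are standard PostgreSQL.

```python
import re

# Accept only simple lowercase identifiers so they can be interpolated safely.
_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")


def rls_statements(role: str, table_id: int,
                   catalog_tables=("ducklake_table", "ducklake_data_file")) -> list[str]:
    """Return SQL that lets `role` read only catalog rows for one table.

    Assumes every listed catalog table has a `table_id` column (an
    assumption about the DuckLake schema, not something confirmed here).
    """
    if not _IDENT.match(role):
        raise ValueError(f"unsafe role name: {role!r}")

    statements = []
    for tbl in catalog_tables:
        if not _IDENT.match(tbl):
            raise ValueError(f"unsafe table name: {tbl!r}")
        policy_name = f"{role}_read_{tbl}"
        # RLS must be enabled on the table before policies take effect.
        statements.append(f"ALTER TABLE {tbl} ENABLE ROW LEVEL SECURITY;")
        # FOR SELECT makes this a read-only policy for the given role.
        statements.append(
            f"CREATE POLICY {policy_name} ON {tbl} "
            f"FOR SELECT TO {role} USING (table_id = {int(table_id)});"
        )
    return statements


# Hypothetical example: role "alice", catalog table id 7.
for stmt in rls_statements("alice", 7):
    print(stmt)
```

Keep in mind that in PostgreSQL, table owners and superusers bypass row-level security by default, so the reader role should be a separate, non-owner role.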
Would love feedback or ideas, especially from anyone running DuckLake in production or dealing with similar access control gaps on non-AWS object storage.
Query Iceberg tables directly via psql without spinning up Trino or Spark clusters.
Self-hosted Neon architecture with S3 checkpoints, but explicitly not for production workloads.
Ephemeral ClickHouse on demand beats Kafka pipelines — but early access limits confidence.
Stateful, exposure-aware de-ID over time—novel framing, but repo is research-only with synthetic data.
Neon-like branching for self-hosters, but explicitly admits it's not for critical production workloads.