Databasus v3.16.0 – new release of self-hosted PostgreSQL backup tool
240k Docker pulls and most-starred backup repo, but GFS retention is the only novelty here.
PostgreSQL backup tool with Point-In-Time-Recovery and restore verification
PG 17 native protocol beats the old agent-based approach pgBackRest uses.
DevOps engineers, Database administrators
pgBackRest · Barman · WAL-G
The first version of physical backups was built around a backup agent. Users had to install the agent (a Go binary) on the database host; the agent then ran pg_basebackup, read WAL segments and pushed them to the Databasus instance.
This implementation turned out to be a mistake, so we removed it. Physical backups (including incremental ones) and WAL streaming are now performed remotely over the PG 17+ native protocol.
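As a rough illustration of what remote PG 17 native backups look like, the sketch below only builds command lines for the stock pg_basebackup and pg_combinebackup utilities. It is not Databasus's actual code, and the project may talk to the server differently; the function names, connection string and paths are hypothetical. Incremental backups also need summarize_wal = on on the server.

```python
from __future__ import annotations

import shlex


def basebackup_cmd(conninfo: str, target_dir: str,
                   prev_manifest: str | None = None) -> list[str]:
    """Build a pg_basebackup command line (hypothetical helper).

    Without prev_manifest this is a full backup. With prev_manifest
    (the backup_manifest file of an earlier backup) it is a PG 17+
    incremental backup relative to that earlier backup.
    """
    cmd = [
        "pg_basebackup",
        f"--dbname={conninfo}",   # remote connection string, no agent on the DB host
        f"--pgdata={target_dir}",  # local directory that receives the backup
        "--wal-method=stream",     # stream the WAL generated while the backup runs
        "--checkpoint=fast",
    ]
    if prev_manifest is not None:
        cmd.append(f"--incremental={prev_manifest}")
    return cmd


def combine_cmd(backup_chain: list[str], output_dir: str) -> list[str]:
    """Build a pg_combinebackup command line (hypothetical helper).

    backup_chain lists the full backup first, then each incremental
    backup in the order it was taken.
    """
    if not backup_chain:
        raise ValueError("backup_chain must contain at least the full backup")
    return ["pg_combinebackup", *backup_chain, f"--output={output_dir}"]


if __name__ == "__main__":
    conninfo = "host=db.internal user=backup"  # assumed example values
    full = basebackup_cmd(conninfo, "/backups/full")
    incr = basebackup_cmd(conninfo, "/backups/incr1",
                          "/backups/full/backup_manifest")
    restore = combine_cmd(["/backups/full", "/backups/incr1"], "/restore/data")
    for cmd in (full, incr, restore):
        print(shlex.join(cmd))
```

Everything here runs from the backup host: the only thing the database side has to provide is a reachable replication-capable connection, which is the point of dropping the agent.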
Why the agent was a mistake:
1) First of all, it was a naive implementation of PITR: just streaming WAL segments is not enough to achieve a low RTO (Recovery Time Objective), because the DB may have to replay a week of WAL segments.
2) Secondly, we introduced the agent to solve one particular task: backing up DBs without exposing them publicly. It turned out that in 99% of cases the solution is to put Databasus itself in the private network or to connect via SSH/bastion. So the agent was an overengineered solution for the "not exposing the DB" task.
3) Operational complexity and bad UX. It was hard for us to develop, test and maintain the agent as a separate tool. For users it was hard to install both Databasus and the agent. There were issues with keeping it running in the background and with making it UX-friendly (which matters even for CLI tools).
So now Databasus relies fully on PG 17+ native backups (PG 17 introduced WAL summarization, which made remote incremental backups possible). This gives Databasus:
1) Improved operational simplicity, because the agent is no longer needed at all. All backups are performed remotely. Users don't have to install anything alongside the DB (so even physical backups of cloud databases are possible now). We don't have to maintain a separate piece of the project, handle its edge cases, compromise on UX or take care of the integration between the agent and the main instance. The fewer moving pieces to configure, the smaller the room for mistakes.
2) Improved RTO: PITR is now achieved with PG 17 incremental backups based on WAL summaries. During a restore you replay WAL only from the latest incremental backup instead of from the latest full backup. The old approach was to make a full backup once a week and then stream WAL. The new approach is to make a full backup once a week, an incremental backup once a day, and then stream WAL.
3) Improved reliability, because we don't reinvent backup mechanisms. Previously, backup tools like pgBackRest or WAL-G had to build their own incremental backup implementations and then test them against every edge case. We decided to rely on the native implementation that appeared in PG 17. On the one hand, we support PITR only for PG 17+ (for earlier versions we offer logical backups). On the other hand, it improves our reliability: we don't have to maintain our own implementation and rely fully on battle-tested tooling. Moreover, PG 17 and higher currently account for ~50% of PostgreSQL installations worldwide, and in 2 years it will be ~80%-90%.
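To make the RTO argument concrete, here is a minimal sketch that picks the restore chain for a recovery target and reports how much WAL would have to be replayed. It is an illustration under simplifying assumptions, not Databasus code: the Backup type, the restore_plan function and the dates are hypothetical, every incremental is assumed to be based on the backup right before it, and backup duration is ignored.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Backup:
    taken_at: datetime
    kind: str  # "full" or "incremental"


def restore_plan(backups, target):
    """Return (chain, wal_span) for recovering to `target`.

    chain is the latest full backup taken at or before the target plus
    every incremental taken after it (still at or before the target).
    wal_span is the stretch of WAL that must be replayed on top of the
    last backup in the chain.
    """
    usable = sorted((b for b in backups if b.taken_at <= target),
                    key=lambda b: b.taken_at)
    full_positions = [i for i, b in enumerate(usable) if b.kind == "full"]
    if not full_positions:
        raise ValueError("no full backup at or before the recovery target")
    chain = usable[full_positions[-1]:]
    wal_span = target - chain[-1].taken_at
    return chain, wal_span


# Hypothetical week: full backup on Monday, incrementals Tuesday to Sunday.
full = Backup(datetime(2025, 1, 6), "full")
incrementals = [Backup(full.taken_at + timedelta(days=d), "incremental")
                for d in range(1, 7)]
target = datetime(2025, 1, 12, 18, 0)  # Sunday 18:00

_, weekly_only = restore_plan([full], target)
chain, with_incrementals = restore_plan([full, *incrementals], target)

print(weekly_only)        # 6 days, 18:00:00
print(with_incrementals)  # 18:00:00
print(len(chain))         # 7
```

With a weekly full backup alone, recovering late in the week means replaying almost seven days of WAL; with daily incrementals the replay window is bounded by roughly one day, at the cost of combining a longer backup chain first.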
These decisions, with explanations and trade-offs, are recorded in ADRs:
- Usage of native PG 17 backups instead of custom implementation - https://github.com/databasus/databasus/blob/main/adr/0008-wh...
- Usage of remote backups instead of agents - https://github.com/databasus/databasus/blob/main/adr/0009-wh...
- Why we moved away from the agent and decided to use native PG 17 backups - https://databasus.com/faq
Also, thank you, Product Hunt community, for the support! A bit more than a year ago, Databasus was published here and received its first stars. Now the project has ~7.6k stars on GitHub and over 1 million Docker Hub pulls. It's just the start! :)
Gzip breaks dedup; this stores uncompressed snapshots on BTRFS for 85% savings.
Nice little CLI: one-liner install and an interactive 'clawstash setup' get you an hourly daemon that auto-downloads restic and uploads AES-256 encrypted, deduplicated blocks to any S3-compatible store. It's pragmatic and tightly scoped — excellent if you run OpenClaw, but mostly a focused wrapper around restic rather than a novel backup system.
Neon-like branching for self-hosters, but explicitly admits it's not for critical production workloads.
Visual pipeline builder beats stitching together shell scripts and cron jobs.
Docker-friendly database backup UI, but Veeam and pg_dump cover these cases.