Just Bunch of Filesystems -- management tools for a symlink-manager JBOD setup
Explicit file placement avoids pooled filesystem performance surprises for large media.
Sysadmins and data engineers managing large media storage
mergerfs · UnionFS · LVM
I created `jbofs` ("Just Bunch of File Systems") as an experiment to work around some recurring issues I have run into with surprising filesystem performance.
My use case is storing large media (think pcaps, or movie files), where some hierarchical organization is still useful.
Whether it’s NFS mounts (and caches), ZFS, or RAID (mis)configurations, I’ve run into surprising(ly bad) performance on many occasions. Doubtless this is largely user error, but it can be hard to diagnose what went wrong, and I’ve resorted to things like copying a file to `/tmp` or some other local mount with a simple ext4/XFS filesystem that I understand. When I see r/w happening at 200MB/s but know that 66GB/s is possible[1], it can be quite disheartening.
I’ve wanted something dead simple which peels back the curtains and provides minimal abstraction (and overhead) atop raw block devices. I’ve messed around with FUSE a bit, and did some simple configuration experiments on my machine (a workstation), but came back to wanting less, not more. I did do some RTFM with immediate alternatives[2], but could have missed something obvious -- let me know!
As a compromise to avoid implementing my own filesystems, I built this atop existing filesystems.
The idea is pretty simple -- copy files to separate disks/filesystems, and maintain a unified “symlink” view to the various underlying disks. Avoid magic and complication where possible. Keep strict filesystem conventions.
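To make the idea concrete, here is a minimal Python sketch of the copy-then-symlink pattern. It is not the actual `jbofs` implementation: the function name `place_file` and the `disk_root` / `view_root` layout are hypothetical, chosen only to illustrate explicit placement on one disk plus a symlink in a unified view.

```python
import shutil
from pathlib import Path


def place_file(src: Path, disk_root: Path, view_root: Path, rel_path: Path) -> Path:
    """Copy src onto one explicitly chosen disk and link it into the unified view.

    Hypothetical helper for illustration; names and layout are assumptions.
    """
    target = disk_root / rel_path   # physical location on the chosen filesystem
    link = view_root / rel_path     # logical location in the symlink view

    # Mirror the logical directory hierarchy on the disk and in the view.
    target.parent.mkdir(parents=True, exist_ok=True)
    link.parent.mkdir(parents=True, exist_ok=True)

    # Plain file copy (with metadata) onto the chosen disk.
    shutil.copy2(src, target)

    # Replace a stale symlink if one exists; a regular file at this path is
    # left alone, so symlink_to raises FileExistsError instead of clobbering it.
    if link.is_symlink():
        link.unlink()
    link.symlink_to(target.resolve())
    return link
```

Under these assumptions, a call like `place_file(Path("capture.pcap"), Path("/mnt/disk0"), Path("/data/view"), Path("pcaps/capture.pcap"))` would leave the bytes on `/mnt/disk0` and a symlink under `/data/view`; both mount points are placeholders. Reads through the view go straight to the underlying ext4/XFS filesystem, with no extra layer in the data path.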
Of course, this is not a filesystem. Maybe that’s a bad thing -- which is one thing I’m trying to figure out with this experiment.
If you have experience with any of these other filesystems or similar tools, I would love to get your feedback, especially thoughts on why this symlink approach is not the way to go!
Lastly, thanks for taking the time to look at this!
[1] https://tanelpoder.com/posts/11m-iops-with-10-ssds-on-amd-th...
[2] https://github.com/aozgaa/jbofs/blob/main/docs/comparison.md