
Live Backups with btrbk

Introduction

Recently, a homelab friend of mine and I have both been diving into the wonderful world of self hosted photos with Immich. Immich is fantastic, good enough that I consider it a valid replacement for Google Photos. That is great, because Google is constantly trying to get me to spend money to upgrade my storage, and I hate feeling like I have a limit on the photos or videos I want to take.

However, there's a problem: relying on Immich means that the homelab data just got upgraded from "well that's annoying" when lost to "relationship altering event" if it goes up in smoke. Unacceptable!

As such, it's time to review the disaster recovery and backup mechanisms for my homelab.

TL;DR

This is the first part of a three part guide, which will cover the following:

  • Utilising filesystem level snapshot and replication with btrbk (You are here)
  • Leveraging ransomware resistant backups with btrbk (pending)
  • Setting up cross-site, ransomware resistant backups with wg-easy and Kopia (pending)

Prerequisites

To follow this first part of the guide, you will need:

  • A Linux computer running Docker. This will be Fedora 39 for the guide
  • A backup target, such as a USB drive (that you are comfortable erasing)

The backup requirements

The first step of any successful project is to lay out the requirements. Not just so we understand what we are trying to accomplish, but so we can have a success state when we get there. What do we want out of our local backups? (let's hold off on the offsite question for now)

  • The capability to create point-in-time snapshots of live data so we can create consistent (and thus restorable) backups (this is referred to as backup atomicity)
  • The ability to target specific folders and subfolders (as we are expecting to keep our configuration and data within a single set of directories, with docker of course!)
  • The actual backups to be consistent, recoverable, and resilient
  • The backup process to be space efficient
  • The backup process to (preferably) be quick
  • The backup target to not be destructible from the backup source (in the event of ransomware or fat fingering, this is for the next article)

These are all problems we can solve with btrbk!

What is Snapshotting?

Before we can talk about the backup software, we need to discuss a larger problem, which is backup atomicity. Backup atomicity refers to the problem of: what if my data changes mid backup? For chattier systems (such as databases), such an event is not just likely to happen, but also likely to corrupt the backup in the process. There are several ways to tackle this problem:

  • Turning everything off before backing it up
  • Snapshotting the whole environment (if virtualized) and backing up at the hypervisor level
  • Using a filesystem that understands folder snapshotting as a function (referred to as copy-on-write filesystems) to take a point-in-time snapshot and copy that instead

All three of these options will provide the result we need. Turning everything off isn't ideal since it requires interrupting all the applications, and results in additional complexity and downtime. Backing up at the hypervisor is acceptable and quite common, but it is problematic in that it encourages lazy backups. If you start relying on backing up everything including the kitchen sink, you will stop understanding the actual recovery process if the operating system gets hosed. It also isn't an option at all if you aren't running inside a virtual machine.

With that in mind, relying on a copy-on-write filesystem for snapshots seems ideal, especially if we're keeping everything contained within a single folder structure. But there's a problem! The "default" Linux filesystem isn't copy-on-write!

ZFS or BTRFS

The default Linux filing system is typically ext4 or xfs, and both have the advantage of being mature and relatively simple compared to copy-on-write filesystems. But they don't have what we need, which is that snapshotting capability.

So what does? Well, ironically Windows has had snapshots this whole time! NTFS (through the Volume Shadow Copy Service) and now ReFS both support snapshots, although NTFS itself is not a copy-on-write filing system. But… we like Linux here. There are two popular CoW filing systems available for Linux: btrfs and ZFS. All popular Linux distros will have btrfs support built in, as it's a filing system built into the kernel. Most popular Linux distros will support ZFS through their package manager, but as its license is incompatible with that of the Linux kernel, it is typically not enabled by default, with some exceptions.

Both ZFS and btrfs can be considered stable (albeit btrfs has some edge cases with raid-5/6), and one isn't necessarily "better" for the purpose we require. However, since btrfs has first class support in Fedora and generally better default compatibility in Linux, that's what we will stick with.

Installing btrbk

Alright, let's get stuck into it! btrbk is a wrapper around the btrfs tooling to perform snapshots and replications very efficiently at the filing system level. This is ideal as it is extremely fast, extremely space efficient, and very simple to recover from (it's just a file copy away). However if all this talk of filing systems hasn't made it obvious, it will only work for btrfs sources and (mostly) btrfs targets.

So what does our server look like? Well, if you're familiar with my guides, I begin most of them having SSHed into a fresh Fedora installation (+ Docker) with Visual Studio Code.

Info

You can read more about this setup here. Note that this guide runs commands as root by default. If you are not running as root, keep this in mind and use sudo where required. Most commands run here will require root.

Because I am using Fedora (and have chosen during partitioning to use a btrfs partition layout), I am ready to get going. However, if you are not using a btrfs root drive, you will have to partition and/or mount a btrfs drive in the location you will want to store your container data (for myself, /mnt/containers).

Choosing the btrfs drive configuration in Fedora Server
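Not sure which filesystem a given path actually lives on? Here is a minimal check, assuming GNU coreutils stat is available (it is on Fedora). The helper name fs_type is just something I made up for illustration.

```shell
# Print the filesystem type that backs a given path,
# for example "btrfs", "xfs" or "ext2/ext3".
# fs_type is a hypothetical helper name, not part of any tool.
fs_type() {
  stat -f -c %T "$1"
}

# Check the root filesystem and report whether it is btrfs.
if [ "$(fs_type /)" = "btrfs" ]; then
  echo "/ is on btrfs"
else
  echo "/ is on $(fs_type /), so a separate btrfs drive is needed"
fi
```

Swap / for wherever you plan to keep your container data if that lives on a different drive.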


If you are using a btrfs root system (like this guide), you will need to create what is called a btrfs subvolume. This is similar to a folder, except it tells the btrfs filing system that it is a special folder to target for snapshot purposes. You can do this with the following.

btrfs subvolume create /mnt/containers

We also have to create a subvolume in that subvolume to hold the snapshots we end up taking.

btrfs subvolume create /mnt/containers/@snapshots

Info

The @ convention for snapshot folders was started by openSUSE.
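If you want to double check that both paths really are subvolumes and not plain folders, the sketch below is one way to do it. It leans on two facts: the path has to sit on btrfs, and the top directory of every btrfs subvolume has inode number 256. The function name is_subvolume is my own; the authoritative answer always comes from btrfs subvolume show <path>.

```shell
# Return success if the path is the top directory of a btrfs subvolume.
# is_subvolume is a hypothetical helper name.
is_subvolume() {
  [ "$(stat -f -c %T "$1" 2>/dev/null)" = "btrfs" ] &&
    [ "$(stat -c %i "$1" 2>/dev/null)" = "256" ]
}

# Report on the two subvolumes created in this guide.
for path in /mnt/containers /mnt/containers/@snapshots; do
  if is_subvolume "$path"; then
    echo "$path is a subvolume"
  else
    echo "$path is NOT a subvolume"
  fi
done
```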

Feel free to open vscode at the folder you created as well.

The expectation is to use /mnt/containers as your source location for all of your bind mounts and your docker containers, treating it as a normal folder in Linux. For now, let's assume we have created some container folders and set up their respective docker-compose files and bind mounts within. Our example will use a Caddy and Immich container setup to be backed up.

Warning

If using the default Immich compose file, the database volume does not get included as a bind mount. I highly recommend changing the database volume to a bind mount within the docker-compose folder so it actually gets backed up.
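To spot containers that still keep data in a Docker-managed volume (and therefore outside /mnt/containers), something like the following may help. This is a rough sketch: the helper names are made up, the container name in the example is an assumption (check docker ps for yours), and it assumes mount destinations contain no spaces.

```shell
# Read "TYPE DESTINATION" lines on stdin and print only the destinations
# that are Docker-managed volumes rather than bind mounts.
only_named_volumes() {
  awk '$1 == "volume" { print $2 }'
}

# List every mount of one container as "TYPE DESTINATION" lines.
container_mounts() {
  docker inspect --format \
    '{{range .Mounts}}{{.Type}} {{.Destination}}{{"\n"}}{{end}}' "$1"
}

# Example (the container name is an assumption, check `docker ps`):
#   container_mounts immich_postgres | only_named_volumes
```

Anything this prints is data that btrbk will not see until you move it to a bind mount.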

Go ahead and install btrbk now. If using Fedora, it's easy!

dnf install -y btrbk

Now we need to add a default configuration in /etc/btrbk/btrbk.conf for snapshotting.

Info

The shell snippet below will overwrite the file so you don't have to edit it manually. You can read more about btrbk config files (they aren't the most intuitive) here. You may need to modify the volume and subvolume if you have mounted an external btrfs partition.

cat > /etc/btrbk/btrbk.conf << '_EOF_'
snapshot_dir               mnt/containers/@snapshots
snapshot_create            onchange

snapshot_preserve          8h 7d 0w 1m 1y
snapshot_preserve_min      latest
target_preserve            8h 7d 0w 1m 1y
target_preserve_min        latest

volume /
  subvolume mnt/containers
_EOF_

Let's give it a go! Just run btrbk run. It should run and finish (if all is configured) instantly. If all went well, you can navigate to your snapshots folder and see an instant copy of all of your data from /mnt/containers. Awesome!
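Since a snapshot is just a read-only folder tree, getting a file back really is a copy. Below is a small sketch of that, assuming btrbk's naming scheme of <subvolume name>.<timestamp> so that a plain sort puts the newest snapshot last. The helper names and the caddy-restored destination are made up for illustration; btrbk list snapshots will show you what btrbk itself knows about.

```shell
# Print the newest snapshot of a subvolume, relying on the
# "<name>.<timestamp>" naming so a plain sort puts the newest last.
latest_snapshot() {
  local snapdir="$1" name="$2"
  ls -1d "$snapdir/$name".* 2>/dev/null | sort | tail -n 1
}

# Copy one folder out of a snapshot into a new location.
# Refuses to run if the destination already exists, so nothing is overwritten.
restore_folder() {
  local snapshot="$1" folder="$2" dest="$3"
  if [ -e "$dest" ]; then
    echo "refusing to overwrite $dest" >&2
    return 1
  fi
  # --reflink=auto lets btrfs share blocks instead of copying them
  cp -a --reflink=auto "$snapshot/$folder" "$dest"
}

# Example:
#   snap="$(latest_snapshot /mnt/containers/@snapshots containers)"
#   restore_folder "$snap" caddy /mnt/containers/caddy-restored
```

Restoring next to the original rather than over it means you can compare the two before swapping anything, ideally with the container stopped.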

Backing up to an External Target

Alright, well this is actually great, we now have a way to instantly make a copy of our running containers, perfectly preserved. We can also keep re-running our backup job, and every time btrbk will maintain or remove old snapshots according to the snapshot_preserve policy in the config file we made.

Info

We defaulted to 8 hourly, 7 daily, 1 monthly, and 1 yearly, which runs a pretty good gamut of coverage.
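The retention string is compact enough that it's easy to misread. As a reading aid only (this is not how btrbk parses it, and it ignores the other values btrbk accepts), here is a tiny sketch that spells out the plain number-plus-letter form used in this guide. describe_retention is a made-up name.

```shell
# Spell out a retention string such as "8h 7d 0w 1m 1y".
# Only handles the simple <number><letter> items used in this guide.
describe_retention() {
  local item count unit
  for item in $1; do
    count="${item%?}"          # everything except the last character
    unit="${item#"$count"}"    # the last character
    case "$unit" in
      h) unit="hourly" ;;
      d) unit="daily" ;;
      w) unit="weekly" ;;
      m) unit="monthly" ;;
      y) unit="yearly" ;;
      *) unit="unknown($unit)" ;;
    esac
    echo "keep $count $unit"
  done
}

describe_retention "8h 7d 0w 1m 1y"
# keep 8 hourly
# keep 7 daily
# keep 0 weekly
# keep 1 monthly
# keep 1 yearly
```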

But snapshotting isn't backing up. It's still on the same source! So let's sacrifice a USB external drive now, and format one we had lying around to btrfs.

lsblk
mkfs.btrfs -f <your-drive-path>

Warning

Formatting a drive erases the drive. Do not format a drive holding data you care about, because it will be gone. Also, Windows is not aware of btrfs as a filing system, so the drive will show as blank in Windows.

Mount the drive in a place of your choice; we will use /mnt/backup. You can do this by fumbling around with /etc/fstab (or temporarily with the mount command), but I find it easier with Cockpit.
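If you do go the /etc/fstab route, a sketch of what the entry could look like is below. It mounts by UUID so the drive is found no matter which device name it gets, and uses nofail so the machine still boots when the USB drive is unplugged. The helper name fstab_line and the mount options are my assumptions, so adjust to taste.

```shell
# Build an /etc/fstab line that mounts a btrfs filesystem by UUID.
# "nofail" lets the machine boot even when the drive is missing.
# fstab_line is a hypothetical helper name.
fstab_line() {
  local uuid="$1" mountpoint="$2"
  printf 'UUID=%s %s btrfs defaults,nofail 0 0\n' "$uuid" "$mountpoint"
}

# Example, run as root, using the same drive path you formatted:
#   mkdir -p /mnt/backup
#   fstab_line "$(blkid -s UUID -o value <your-drive-path>)" /mnt/backup >> /etc/fstab
#   mount /mnt/backup
```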

  • Create a subvolume to back up to in /mnt/backup
btrfs subvolume create /mnt/backup/containers
  • Modify your btrbk.conf file to include /mnt/backup as a target
cat > /etc/btrbk/btrbk.conf << '_EOF_'
snapshot_dir               mnt/containers/@snapshots
snapshot_create            onchange

snapshot_preserve          8h 7d 0w 1m 1y
snapshot_preserve_min      latest
target_preserve            8h 7d 0w 1m 1y
target_preserve_min        latest

volume /
  target /mnt/backup/containers
  subvolume mnt/containers
_EOF_

Rerun the backup!

btrbk run

This one will take a bit longer because it's actually copying all the data across (but it will run at basically full disk speeds). Once done, have a look in /mnt/backup/containers!

Info

Hey, not only did it copy the latest snapshot, it also copied the previous snapshots that were covered in the maintenance config! Awesome!

Even better, now all future backups to the target will be incremental at the filesystem block level. It's incredibly fast, and transparent at the file level. As far as we can tell, we get a complete copy of our file structure every time, even though the incremental magic is happening at the btrfs level!
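For the curious, the incremental magic boils down to btrfs send with a parent snapshot, piped into btrfs receive on the target. The sketch below only prints the kind of pipeline involved rather than running it; it is not btrbk's exact invocation, and the timestamps are made up. For this to work, the parent snapshot has to exist on both the source and the target.

```shell
# Print (not run) the btrfs pipeline for one incremental transfer.
# Only the blocks that differ between parent and child cross the pipe.
# incremental_send_cmd is a hypothetical helper name.
incremental_send_cmd() {
  local parent="$1" child="$2" target="$3"
  printf 'btrfs send -p %s %s | btrfs receive %s\n' "$parent" "$child" "$target"
}

# Snapshot names below are invented examples.
incremental_send_cmd \
  /mnt/containers/@snapshots/containers.20240101T0100 \
  /mnt/containers/@snapshots/containers.20240101T0200 \
  /mnt/backup/containers
```

btrbk's job is to pick the right parent, run the transfer, and clean up afterwards, which is exactly the fiddly part you don't want to do by hand.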

Scheduling btrbk

We're almost there. We're now making point in time snapshots of the actively running containers, moving them across to our USB drive, and doing so extremely space and time efficiently. But we haven't scheduled anything yet. Luckily, that isn't too hard either.

  • Create the systemd service and timer
cat > /etc/systemd/system/btrbk-hourly.service << '_EOF_'
[Unit]
Description=btrbk-hourly

[Service]
Type=oneshot
ExecStart=/bin/btrbk run
WorkingDirectory=/root
_EOF_

cat > /etc/systemd/system/btrbk-hourly.timer << '_EOF_'
[Unit]
Description=btrbk-hourly

[Timer]
OnCalendar=hourly
Persistent=true

[Install]
WantedBy=timers.target
_EOF_
  • Test the service and check the logs
systemctl daemon-reload
systemctl start btrbk-hourly
journalctl -u btrbk-hourly

  • Turn on the automatic timer
systemctl enable btrbk-hourly.timer --now

And we're good to go. Now the timer will start on boot, run the backup hourly, and the service will provide system logs/errors if it fails (which it shouldn't unless the external drive is unplugged).
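Speaking of the drive being unplugged: if you'd rather get a clear message than a btrbk error in that case, one option is a small wrapper that checks the mount first. This is only a sketch under my own assumptions (the helper names are made up, and mountpoint comes from util-linux); you could save it as a script and point ExecStart at it.

```shell
# Succeed only when the given path is a mounted filesystem.
target_ready() {
  mountpoint -q "$1"
}

# Run btrbk only when the backup target is mounted,
# otherwise fail with a clear message in the journal.
guarded_backup() {
  local target="$1"
  if ! target_ready "$target"; then
    echo "backup target $target is not mounted, skipping btrbk" >&2
    return 1
  fi
  btrbk run
}

# Example:
#   guarded_backup /mnt/backup
# To see when the timer fires next:
#   systemctl list-timers btrbk-hourly.timer
```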

Moving On

If that's all you're looking for, you're actually good to go. You now have a very efficient, very fast backup system that can safely back up your running docker containers without even turning them off. However, should your host get compromised, you're still at risk (nothing is stopping them from wiping your computer and your external drive). Next up (in the next pending article), we'll discuss how we can perform ransomware resistant backups with btrbk. Until then, enjoy!