You probably want to take advantage of the data integrity checking offered by Btrfs. Btrfs calculates checksums for all data written to disk. These checksums are used to verify the data hasn’t been unduly altered. While data is verified every time it is read, what about the data that isn’t read often? How long may bit rot go unnoticed in that case? That’s the crux of this blog post which will explain how to best preserve your data on Btrfs and detect corruption early.

Scrub

To scrub you filesystem is to have all the data read from disk and validated against the stored checksums. This detects corrupt data. When coupled with redundancy such as a raid configuration, self-healing fully restores the damaged data on the disk. If you don’t use redundancy, then the scrub will alert you to the corruption so that you can restore the data manually from backups. Both Btrfs and ZFS handle scrubs in this manner.

To scrub a Btrfs filesystem use btrfs-scrub(8), and in case your interested, the equivalent ZFS command is zpool-scrub(8). Both of them also offer ways to cancel, pause, resume, and monitor scrubs. Btrfs scrubs entire filesystems at a time which is provided by a device or just any directory’s path on the target filesystem. I’m not exactly sure why it takes a directory path to anywhere on the filesystem since that seems a bit arbitrary. You should probably use either a mount point or device path to make the intended target clear.

Even if the btrfs-scrub command accepts a directory path, it doesn’t necessarily just scrub that directory. It will scrub the entire filesystem where that directory resides.

To initiate a scrub in the background, use the start subcommand followed by the path or device. Here I initiate a scrub on the device on which my root filesystem resides.

sudo btrfs scrub start (df --output=source / | tail -n 1)
scrub started on /dev/mapper/sda2_crypt, fsid 175792e7-4167-40d1-aebc-78b948d6d378 (pid=10555)

To check on the status of a scrub, use the status subcommand and the path or device. Check the status of the previous scrub like so.

sudo btrfs scrub status (df --output=source / | tail -n 1)
scrub status for 175792e7-4167-40d1-aebc-78b948d6d378
	scrub started at Fri Mar  5 06:07:42 2021, running for 00:01:25
	total bytes scrubbed: 26.19GiB with 0 errors

In many circumstances, you might want the scrub to block and return once it finishes. This is ideal for people like me who don’t want to type a status command constantly and it’s ideal for running the scrub as a command in systemd. Use the -B flag to scrub in the foreground. This command scrubs my boot partition and returns once the scrub is complete.

sudo btrfs scrub start -B /boot
scrub done for 264b42a6-a09c-40cc-b754-88926d43b395
	scrub started at Fri Mar  5 06:13:23 2021 and finished after 00:00:01
	total bytes scrubbed: 159.55MiB with 0 errors

That didn’t take long! There’s also subcommands to pause, resume, and cancel scrubs as needed.

Schedule

Scheduling regular scrubs is a necessary component of proper maintenance You can regularly run scrubs manually or automate the process of running them yet it’s critical that you monitor the results either way. If you go to the trouble to automate your scrubs you’ll want to make sure to regularly check the results. Ideally you’d use something like www.nagios.org[Nagios] for monitoring this aspect of your systems.

Don’t rely on alerts whether that is through email or desktop notifications. If they fail silently, you won’t realize when something has gone horribly wrong. Set aside time regularly to check your systems' status and health.

Arch Linux provides a handy systemd.service and systemd.timer to automate scrubs. The Btrfs maintenance toolbox provides similar functionality. We’ll take a look at the instantiable systemd units provided by Arch Linux for how to make scheduling regular scrubs a breeze. The Arch Linux Wiki’s Btrfs Scrub section has a subsection on these systemd units, Start with a service or timer. The systemd units here should be dropped in the standard system directory /etc/systemd/system.

Service

/etc/systemd/system/btrfs-scrub@.service
[Unit]
Description=Btrfs scrub on %f
ConditionPathIsMountPoint=%f
RequiresMountsFor=%f

[Service]
Nice=19
IOSchedulingClass=idle
KillSignal=SIGINT
ExecStart=/usr/bin/btrfs scrub start -B %f

This systemd.service is an instantiated service which expects that a properly escaped path is provided after the @ and before the .service extension. systemd uses special escaping rules to map filesystem paths to unit file names. The systemd-escape(1) tool makes it quite easy to convert a given path.

This service requires that the path of the service unit is indeed a mount point and that it exists with ConditionPathIsMountPoint. The argument %f represents the unescaped path used to instantiate this systemd unit. Similarly, the %i flag is the escaped version of the path used to instantiate this unit, that is the string between @ and before .service when starting the unit. RequiresMountsFor will ensure that any mount points on the given path are mounted before executing the unit.

One might opt to use {BindsTo} and {After} instead of RequiresMountsFor to define a stronger relationship to the systemd.mount unit responsible for mounting the filesystem at the given mount point. systemd mount units are usually generated automatically from entries in /etc/fstab. For this dependency relationship to work, a corresponding systemd mount unit needs to exist. You’ll want the filesystem your scrubbing to have an entry in fstab or otherwise provide the mount unit in some other way. BindsTo requires that the filesystem at the mount point be available the entire time this unit is running. If it becomes unavailable for some reason, the mount unit fails and the scrub service is killed along with it. The After keyword requires that the target be mounted before this service runs. Both of these would be set to %i.mount, the name of the corresponding systemd mount unit.

The Nice directive sets the scheduling priority to the lowest possible value, 19, giving the scrub a very low priority to avoid hogging the system CPU time. The IOSchedulingClass directive is set to idle which effectively means that the IO activity of the process shouldn’t impact normal system activity. the scrub will only use the disk when no other programs are using it. KillSignal sets the signal used to kill the process to SIGINT, i.e. Ctrl-C. Finally, the ExecStart executes the scrub command on the unescaped path used to instantiate the service but uses -B to avoid immediately returning.

The systemctl(1) command handles interacting with systemd services and units. To start a scrub directly with the systemd service, start the the systemd unit with systemctl start. Here, I start the unit on the root path of the filesystem which is converted by systemd to -.

sudo systemctl start btrfs-scrub@(systemd-escape -p /).service

You can then check the status of the systemd service with systemctl status as follows.

sudo systemctl status btrfs-scrub@(systemd-escape -p /).service
● btrfs-scrub@-.service - Btrfs scrub on /
   Loaded: loaded (/etc/systemd/system/btrfs-scrub@.service; static; vendor preset: enabled)
   Active: inactive (dead)

Timer

Below is the Arch Linux systemd Btrfs scrub timer albeit with a small modification on my part. The systemd.timer runs on the first and fifteenth of every month instead of only once a month. Weekly is also a good option which can be configured by setting OnCalendar to weekly.

/etc/systemd/system/btrfs-scrub@.timer
[Unit]
Description=Btrfs scrub on %f twice per month

[Timer]
OnCalendar=*-*-1,15
AccuracySec=1d
RandomizedDelaySec=1w
Persistent=true

[Install]
WantedBy=timers.target

The Persistent keyword ensures the service runs even if the timer would have fired previously but the system was not available. If you miss a scrub due to your machine being powered off, the scrub will happen the next time you boot up.

Use systemctl enable to activate the timer. Here I set the timer to scrub the root filesystem automatically activate at boot while starting the timer immediately with --now.

sudo systemctl enable --now btrfs-scrub@(systemd-escape -p /).timer
Created symlink /etc/systemd/system/timers.target.wants/btrfs-scrub@-.timer → /etc/systemd/system/btrfs-scrub@.timer.

As with the service, you can check the status of the systemd timer which is shown here.

sudo systemctl status btrfs-scrub@(systemd-escape -p /).timer
● btrfs-scrub@boot.timer - Btrfs scrub on / twice per month
   Loaded: loaded (/etc/systemd/system/btrfs-scrub@.timer; indirect; vendor preset: enabled)

Conclusion

That’s a scrub! Hopefully you’ve got some valuable insight into scrubbing and managing scrubs with Btrfs. Happy scrubbing!