Saving Space in Rsnapshot Archives with Hardlinks

The hardlink package is quite handy on Linux systems. It appears to do a nice job saving disk space in my local rsnapshot archives (25 GB reported as “saved” after today’s run on a week of daily snapshots). The stress put on my external USB backup disk wasn’t too extreme either, as reported by Munin’s IO/sec graphs for /dev/sdc:

IO/sec graph for a USB disk, while hardlink was running on a collection of rsnapshot daily archives.

IO/sec graph for a USB disk, while hardlink was running on a collection of rsnapshot daily archives.

That’s a lot of read operations, and then a percentage of them converted back to writes, when hardlink(1) discovers duplicate files and replaces them with hard links. Which is exactly the sort of behavior we’d expect from this sort of thing.

The hardlink invocation that triggered these I/O operation was quite mundane:

# time hardlink -f -m -v *home.*
[output stats ellided]
        1380.243 real   17.525 user     47.975 sys

Having run for a little over 23m it managed to hardlink enough files in my daily laptop backup snapshots to save 25 GB of disk space. I think it was worth the time vs. space trade-off :-)