Rsnapshot
From ConShell
Introduction
rsnapshot is a filesystem snapshot utility for making backups of local and remote systems.
Schedules
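My favorite schedule is the 2-4 schedule, which yields full backup sets for today and yesterday, plus weeklies going back as far as a month.
rsnapshot.conf should contain the following (fields must be separated by tabs, not spaces; older releases use the keyword interval instead of retain)...
retain	daily	2
retain	weekly	4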
Cronjobs
cat /etc/cron.d/rsnapshot
0 23 * * * root /usr/bin/rsnapshot daily
30 22 * * 6 root /usr/bin/rsnapshot weekly
#This sends a copy of any errors that occurred via e-mail to root
#Useful to identify when things have broken
21 6 * * * root cat /var/log/rsnapshot.log | grep ERROR
Another graceful way to handle this from cron is to use log levels 2 & 4; then any cron'd run will be quiet unless there are problems.
Compression?
Backup disk full? Rsnapshot isn't very compatible with compressed backups, because compressing a file that is hard-linked across archive sets creates a new copy instead of freeing space, which destroys the efficiency of the hard links. But there are a couple of workarounds.
The first is that SOME of the files in the (2nd & beyond) archive sets can be compressed: those with a link count of 1, i.e. files that aren't shared with any other set.
find daily.1 -type f -links 1 -size +1M ! -name "*.bz2" ! -path "*/.svn/*" -print0 | xargs -0 pbzip2 -v
To revert:
find daily.1 -type f -links 1 -name "*.bz2" -print0 | xargs -0 pbzip2 -d -v
Only run that against the 2nd archive set (e.g. hourly.1, daily.1 or weekly.1) of the shortest interval in your configuration.
Another approach is to use a filesystem that supports native/transparent compression, like ZFS or Btrfs.
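Before compressing anything, it can help to preview which files would qualify. The sketch below only lists candidates and changes nothing; the function name list_candidates is made up for illustration, and it assumes a find that understands -links and the M size suffix (GNU and BSD find both do).

```shell
#!/bin/sh
# Hypothetical helper: print files a compression pass would pick up.
# A link count of 1 means the file is not shared with any other archive set.
list_candidates() {
    snapshot=$1   # e.g. daily.1
    find "$snapshot" -type f -links 1 -size +1M \
        ! -name "*.bz2" ! -path "*/.svn/*" -print
}

# Only run when a snapshot directory is given on the command line.
if [ $# -gt 0 ]; then
    list_candidates "$1"
fi
```

Saved as a script, it would be called with the snapshot directory as its only argument, for example daily.1.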
Q & A
How much data can rsnapshot back up and for how long?
A. This can be calculated as the size of the original backup set + the "churn", where churn is the total size of all the files added or removed between the oldest and the newest backup set. If original + churn is larger than the total storage available for the backup, you will run out of disk space.
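As a rough worked example of that arithmetic, the function below adds an original set size to a churn figure and compares the total with the capacity of the backup volume. The function name and the sample numbers are invented for illustration; real figures would come from measuring the snapshot root with du.

```shell
#!/bin/sh
# Hypothetical helper: all three arguments are whole numbers in the same unit (e.g. GiB).
fits_on_disk() {
    original=$1   # size of the original backup set
    churn=$2      # data added or removed across the retention window
    capacity=$3   # total space on the backup volume
    needed=$((original + churn))
    if [ "$needed" -le "$capacity" ]; then
        echo "ok: need $needed of $capacity"
    else
        echo "full: need $needed of $capacity"
        return 1
    fi
}

fits_on_disk 300 150 500   # prints "ok: need 450 of 500"
```

With the same capacity, an original set of 400 and churn of 150 would report "full" and return a non-zero status.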
How can I get the most usage out of my available disk space?
A. Follow the best practices (below). Reduce the size of the original set, e.g. by using excludes. Use the compression trick described above (only compress the 2nd set of the shortest interval), or use a filesystem that supports transparent compression.
What are some best practices for using Rsnapshot?
A. First of all, when backing up logfiles you should make sure your logrotate.d/* configurations specify the 'dateext' and 'compress' options, so that rotated logfiles carry a date stamp and are already compressed before rsnapshot gets to them. This keeps rsnapshot from saving multiple copies of the same data, since the filenames no longer change on every rotation (0 to 1, 1 to 2 etc). Another way is to use exclude=foo options to minimize excessive backups.
Rsnapshot seems to fail in mysterious ways...what are some ways to mitigate these problems?
For example, old backup sets hanging around, configuration pickiness (the space-for-tab problem) and other types of nonsense.
A. There are a number of things that can be done. First of all, having a configuration checker (rsnapshot configtest) run via cron one or more times per day can help prevent a broken configuration from going unnoticed for very long. Examining the contents of daily.0 (or hourly.0 if you go hourly) using ls -al daily.0/ can help to expose stale folder structures (perhaps from a host that no longer exists). Finally, bump the verbosity on the logging and make sure the output from the rsnapshot cronjobs is getting sent to you via e-mail. Review this regularly for any sign of problems. Alternatively, see the cronjob example above for a command that will weed out the errors from the log.
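A cron-friendly configuration check could look like the sketch below. It relies on rsnapshot's configtest subcommand and stays silent on success, so cron only sends mail when the check fails. The function name check_config and the RSNAPSHOT override variable are assumptions made for this example.

```shell
#!/bin/sh
# Sketch of a quiet configuration check for cron.
# RSNAPSHOT is a hypothetical override; by default rsnapshot is looked up on PATH.
RSNAPSHOT=${RSNAPSHOT:-rsnapshot}

check_config() {
    # Capture stdout and stderr; print them only if configtest exits non-zero.
    if output=$("$RSNAPSHOT" configtest 2>&1); then
        return 0
    fi
    echo "rsnapshot configtest failed:"
    echo "$output"
    return 1
}
```

A script that defines this function and then calls check_config can be scheduled a few times per day from /etc/cron.d, next to the backup jobs.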
How can I prevent stale / orphaned data from leeching my disk space?
A. Over time (as hosts come and go) you may find stale data is left in daily.0, never getting purged as it should. The way I deal with this is to write a cronjob in /etc/cron.daily/breadcrumb on each system I back up. It touches a .crumb file in the root of each normal partition that gets backed up. This should be complemented with a cronjob on the Rsnapshot server which finds the stale breadcrumbs and notifies you. Example...
sudo find /path/to/backup/daily.0 -maxdepth 4 -name .crumb -mtime +1 -ls
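The client side of this scheme might be sketched as follows. The function name drop_crumbs is made up for illustration, and the directories to mark are passed in as arguments rather than guessed.

```shell
#!/bin/sh
# Sketch of a breadcrumb script for a backed-up client.
# Each argument is the root of a partition that rsnapshot backs up.
drop_crumbs() {
    for dir in "$@"; do
        # Refresh the timestamp so the server-side find sees a recent .crumb
        touch "$dir/.crumb" || echo "breadcrumb: cannot write $dir/.crumb" >&2
    done
}

drop_crumbs "$@"
```

Scripts in /etc/cron.daily are normally run without arguments, so when installed there the "$@" on the last line would be replaced with the actual list of partition roots.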
How can I preserve hard-links in my backups?
A. By default, rsync_short_args is only -a, which does not include -H (preserve hard links). So, specify it like so...
rsync_short_args -aH