rsnapshot with sync_first

So I set up my new server and reevaluated my rsnapshot setup. I see there are now 3 options (new to me, at least) to set.

link_dest 1
sync_first 1
use_lazy_deletes 1

link_dest just tells rsnapshot to use rsync’s --link-dest option, which handles syncing with hard links better.
use_lazy_deletes is a nicer way of ordering things: the oldest snapshot is moved aside first, all the rotating and syncing is done, and only then is the old stuff deleted.

sync_first is the one that seems newer to me. Basically you run rsnapshot sync to create a new backup in the .sync folder. Then calling any other rsnapshot <hourly | daily | weekly> just rotates (moves and deletes).
From what I’ve read, the big advantage is that if the sync fails, you don’t wind up rotating a bunch of bad backups into your mix. But what I tested today shows you can still lose your higher-level backups. If hourly is your lowest level, your daily, weekly, and monthly backups can rotate off the cliff if failed backups continue undetected. This doesn’t happen to the hourly backups, because hourly is only called immediately after a sync completes successfully. So if the sync fails, the hourly doesn’t rotate.
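For anyone following along, here is a minimal sketch of the "hourly only after a successful sync" chaining described above. The function name sync_then_hourly and the RSNAPSHOT variable are made up for illustration; it only assumes that rsnapshot exits non-zero when the sync fails.

```shell
#!/bin/bash
# Command to run; overridable so the logic can be tried without rsnapshot installed.
RSNAPSHOT="${RSNAPSHOT:-rsnapshot}"

# Run "sync" and rotate the hourly level only if the sync succeeded.
sync_then_hourly() {
  if "$RSNAPSHOT" sync; then
    "$RSNAPSHOT" hourly
  else
    echo "rsnapshot sync failed; skipping hourly rotation" >&2
    return 1
  fi
}
```

Calling sync_then_hourly from cron in place of separate sync and hourly entries would keep the hourly level safe, but it does nothing for the daily and higher rotations, which is the gap I’m asking about.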

So I was just wondering if anyone had found a nice way to work around this.

Below is an example of what I mean. It assumes the sync is failing, so the second hourly (2 of 2, hourly.1) doesn’t exist: it has already been moved to daily.0, and a new hourly.1 isn’t created because of the failed sync.
If daily continues to run, it will eventually shift all the backups off the cliff and there will be no dailies left.

Thanks for any advice!


[root@server1 snapshots]# rsnapshot daily
echo 17035 > /var/run/rsnapshot.pid
mv /backup/snapshots/daily.3/ /backup/snapshots/daily.4/
mv /backup/snapshots/daily.2/ /backup/snapshots/daily.3/
mv /backup/snapshots/daily.1/ /backup/snapshots/daily.2/
mv /backup/snapshots/daily.0/ /backup/snapshots/daily.1/
mv /backup/snapshots/hourly.1/ /backup/snapshots/daily.0/
rm -f /var/run/rsnapshot.pid
[root@server1 snapshots]# rsnapshot daily
echo 17209 > /var/run/rsnapshot.pid
mv /backup/snapshots/daily.4/ /backup/snapshots/daily.5/
mv /backup/snapshots/daily.3/ /backup/snapshots/daily.4/
mv /backup/snapshots/daily.2/ /backup/snapshots/daily.3/
mv /backup/snapshots/daily.1/ /backup/snapshots/daily.2/
mv /backup/snapshots/daily.0/ /backup/snapshots/daily.1/
/backup/snapshots/hourly.1 not present (yet), nothing to copy
rm -f /var/run/rsnapshot.pid
[root@server1 snapshots]# rsnapshot daily
echo 17701 > /var/run/rsnapshot.pid
mv /backup/snapshots/daily.5/ /backup/snapshots/daily.6/
mv /backup/snapshots/daily.4/ /backup/snapshots/daily.5/
mv /backup/snapshots/daily.3/ /backup/snapshots/daily.4/
mv /backup/snapshots/daily.2/ /backup/snapshots/daily.3/
mv /backup/snapshots/daily.1/ /backup/snapshots/daily.2/
/backup/snapshots/hourly.1 not present (yet), nothing to copy
rm -f /var/run/rsnapshot.pid
[root@server1 snapshots]# rsnapshot daily
echo 18490 > /var/run/rsnapshot.pid
mv /backup/snapshots/daily.6/ /backup/snapshots/_delete.18490/
mv /backup/snapshots/daily.5/ /backup/snapshots/daily.6/
mv /backup/snapshots/daily.4/ /backup/snapshots/daily.5/
mv /backup/snapshots/daily.3/ /backup/snapshots/daily.4/
mv /backup/snapshots/daily.2/ /backup/snapshots/daily.3/
/backup/snapshots/hourly.1 not present (yet), nothing to copy
rm -f /var/run/rsnapshot.pid
/bin/rm -rf /backup/snapshots/_delete.18490

Hi Justec
We use r1soft for backups. We had issues with high load during normal IW backup runs (we do have some high usage/volume sites), and this immediately reduced the load to less than 2.
Have you looked at http://www.rfxn.com/appdocs/README.irsync? We have it installed on our test server, and it does appear to work very well.
I think this would overcome your issue of backups disappearing and gives you more options.
I hope that helps
Many thanks
John
The point-in-time backups which are restorable as full backups are stored in
the .SNAPS directory, these are rotated off for deletion based on the max age
value in conf.irsync using find’s mtime option piped to rm.
A common misconception is that deleting a hard link will delete the source data
but this is not the case. When an rm is run on hardlink pointers, the number of
links is checked and the data is only deleted when links reaches 0.
To demonstrate how the backups work on the storage server we can look at the below
storage layout details to see how the snapshots and full image get populated.
The full image synced data with size and # of files:

ls freedom.lan.full/

etc home local root var mysqldump mysqlhotcopy

du -sh freedom.lan.full/

1.9G freedom.lan.full

find freedom.lan.full | wc -l

17911
Now let's assume we have run three iterations of irsync to date, the snapshots
path would look something like this:

ls freedom.lan.snaps/

2010-02-19.202026 2010-02-20.202718 2010-02-21.191503

ls 2010-02-21.191503/

etc home local root var mysqldump mysqlhotcopy

du -shc *

12M 2010-02-19.202026
133M 2010-02-20.202718
275M 2010-02-21.191503

for i in `ls`; do find $i | wc -l; done

17819 2010-02-19.202026
18416 2010-02-20.202718
18227 2010-02-21.191503
So what does this all translate into? As we can see, our full backup is 1.9G in
size with 17.9k files then subsequent backups have synced in changed data only
with the 2010-02-19.202026 image having 12M of changed data and an offset of 92
fewer files. Although we capture the changed data in the 02-19 snap, we also
have all our original data as indicated by the file counts but without having
the space overhead of duplicating the data.
This is done by hard linking to the full image for any unchanged data. On
subsequent irsync runs when new changed data is synced in, it breaks the hard
links in the snapshots which leave behind a copy of the original data in its
previous state. This method of point-in-time incremental backups allows for
the easy retention of changed data, with minimal space usage while having a
logical backup layout that is fully restorable from each individual snapshot
and compatible with any utility as hard links are treated just like regular
files and directories.
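
The hard-link behaviour described above can be tried by hand. This is only a sketch: hardlink_demo and the file names are made up, and it imitates rsync's default behaviour of writing changed data to a new file and renaming it over the old name (rather than editing the file in place).

```shell
#!/bin/bash
# Show that replacing a file in the "full" copy leaves the snapshot's
# hard-linked copy holding the previous contents.
hardlink_demo() {
  local dir
  dir=$(mktemp -d) || return 1
  echo "version 1" > "$dir/full.txt"
  ln "$dir/full.txt" "$dir/snap.txt"        # snapshot name shares the same inode
  echo "version 2" > "$dir/full.txt.tmp"    # changed data lands in a new file
  mv "$dir/full.txt.tmp" "$dir/full.txt"    # rename over the old name breaks the link
  echo "snap: $(cat "$dir/snap.txt")"
  echo "full: $(cat "$dir/full.txt")"
  rm -rf "$dir"
}
```

Running hardlink_demo prints "snap: version 1" followed by "full: version 2": the snapshot keeps the old data while the full image moves on.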

Yeah, this is a backup outside of the normal SiteWorx backups, just backing up the raw files and databases.

I’ll have to check those others out, but I really like rsnapshot. I like that it uses rsync and makes what are basically incremental backups, but by using hard links every backup appears to be a full one for that specific date. It works very similarly to Time Machine on Apple.

If I stick with rsnapshot, I’m thinking of just creating my own front-end script to call the daily, weekly, and monthly rotations and check that the base of each rotation, from the hourly (hourly.1 to daily.0), daily (daily.7 to weekly.0), and weekly (weekly.4 to monthly.0), is in place before starting the rotation. So instead of the default behaviour of moving the dailies, for example, like this…

daily.7 --> Delete
daily.6 --> daily.7
daily.5 --> daily.6
daily.4 --> daily.5
daily.3 --> daily.4
daily.2 --> daily.3
daily.1 --> daily.2
daily.0 --> daily.1 (fails because daily.0 doesn’t exist, so neither daily.0 nor daily.1 exists after this run)

The script would just do a check, something like this:


if [ -d "snapshots/daily.0" ]; then
  rsnapshot daily
fi
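
A slightly fuller sketch of that front-end idea, checking the snapshot that would be promoted (for example hourly.1 before a daily run) rather than daily.0. The function name guarded_rotate and the SNAPSHOT_ROOT and RSNAPSHOT variables are my own placeholders; /backup/snapshots is the path from my output above, and the level names depend on your rsnapshot.conf.

```shell
#!/bin/bash
SNAPSHOT_ROOT="${SNAPSHOT_ROOT:-/backup/snapshots}"
RSNAPSHOT="${RSNAPSHOT:-rsnapshot}"

# guarded_rotate LEVEL SOURCE
# Run "rsnapshot LEVEL" only if the snapshot that would be promoted
# into LEVEL.0 (SOURCE, e.g. hourly.1) is actually present.
guarded_rotate() {
  local level="$1" src="$2"
  if [ -d "$SNAPSHOT_ROOT/$src" ]; then
    "$RSNAPSHOT" "$level"
  else
    echo "skipping $level: $SNAPSHOT_ROOT/$src is missing" >&2
    return 1
  fi
}
```

Cron would then call something like guarded_rotate daily hourly.1 instead of rsnapshot daily, so a run of failed syncs leaves the existing dailies where they are.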

Hi justin

That sounds good.

I wouldn’t mind testing it on our test server if you create it and you don’t mind

Are you going to incorporate a MySQL hot dump?

Many thanks

John

Not sure I know this phrase. Right now I just grab the raw MySQL files from /var/lib/mysql. Do you mean taking a mysqldump of some sort? Or something else entirely?

Hi justin

Yes, sorry, making a dump as opposed to copying the db files.

I mention this because if the db is in use, it might be locked, or the copy could be corrupt.

Many thanks

John

Yeah, good idea. I think I could dump them all into individual sql files in a specific folder and then back up that folder as part of the rsnapshot run as well.

When just using the iworx command line backup for the database only (which basically does a mysqldump anyway), how come it still seems to back up the entire SiteWorx account?


~iworx/bin/backup.pex --domains all \
                      --backup-options db \
                      --output-dir /var/db-backups/ \
                      --quiet \
                      --email admin@webhost.com

I’m guessing it’s because it’s doing a structure backup and a DB backup. Is there any way to do just a SiteWorx DB backup, or should I just do a direct dump? I just wanted to avoid having to put the mysql root password in a script.

Also, it seemed to ignore the --output-dir directory and just put the backup in the root folder /

Hi justec

I personally would do a direct dump.

I believe this is how r1soft works on MySQL

Many thanks

John

Yeah, I looked up a way to do this without having to put the password in the script. You create a file called .my.cnf in root’s home directory and chmod it to 600.

In there you put:

[mysqldump]
user=root
password=your_root_password

Then you can run mysqldump on the command line without having to enter any user or password:

[root@server1 ~]# mysqldump database_name > database_name-dump
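
If you want to script creating that file, something along these lines should do it. This is only a sketch: write_my_cnf is a made-up helper, it reads the password from standard input so it doesn’t end up in the script itself, and it only writes the [mysqldump] section shown above.

```shell
#!/bin/bash
# write_my_cnf FILE
# Read the password from stdin and write FILE so that only its owner can read it.
write_my_cnf() {
  local file="$1" password
  IFS= read -r password || return 1
  (
    umask 077   # anything created in this subshell gets mode 600
    printf '[mysqldump]\nuser=root\npassword=%s\n' "$password" > "$file"
  ) || return 1
  chmod 600 "$file"   # also tightens the mode if the file already existed
}
```

Run it as write_my_cnf /root/.my.cnf and type the password followed by Enter. Note the password is echoed as you type with plain read.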

Thanks for the advice.

For anyone interested I also added [mysql] to the .my.cnf file and then created this script I found on the Internet.

.my.cnf


[mysql]
user=root
password=your_root_password

[mysqldump]
user=root
password=your_root_password

mysqlBackupScript


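#!/bin/bash
# Dump every database to its own gzip'd file in BACKUP_DIR.

BACKUP_DIR="/var/lib/mysql_backups"
mkdir -p "$BACKUP_DIR"

# -N drops the "Database" header row; the grep skips the virtual schemas by exact name.
databases=$(/usr/bin/mysql -N -e "SHOW DATABASES;" | grep -Ev "^(information_schema|performance_schema)$")

for db in $databases; do
  # --databases adds CREATE DATABASE/USE statements so each dump restores on its own.
  /usr/bin/mysqldump --force --opt --databases "$db" | gzip > "$BACKUP_DIR/$db.gz"
done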

So basically this creates an individual gzip’d backup file of every database on the system in the backup directory. Those files then get picked up by my rsnapshot rotation when it backs up the var directory, so I have multiple snapshots of the databases.