So I set up my new server and reevaluated my rsnapshot setup. I see there are now three new (well, new to me at least) options to set:
link_dest 1
sync_first 1
use_lazy_deletes 1
link_dest basically tells rsnapshot to use rsync's --link-dest feature, which handles syncing with hard links better.
use_lazy_deletes is just a nicer way of doing it: finish all the syncing first, then delete the old stuff afterwards.
sync_first is the one that seems newest to me. Basically you run rsnapshot sync to create a new backup in the .sync folder; then calling any other rsnapshot <hourly | daily | weekly> just rotates (moves and deletes).
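With sync_first enabled, the invocation ends up split into a sync stage and rotate-only stages, something like these crontab entries (times and paths here are just an example, not from my actual setup):

```
# the rsync work happens in "sync"; rotate hourly only if the sync succeeded
0 */4 * * *   /usr/bin/rsnapshot sync && /usr/bin/rsnapshot hourly
# the other intervals only rotate directories
30 23 * * *   /usr/bin/rsnapshot daily
0  23 * * 6   /usr/bin/rsnapshot weekly
30 22 1 * *   /usr/bin/rsnapshot monthly
```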
The big advantage I've read about is that if the sync fails, you don't wind up rotating a bunch of bad backups into your mix. But what I tested today shows you can still lose your higher-level backups. Meaning, if hourly is your lowest interval, your daily, weekly, and monthly backups can rotate off the cliff if you continue to have undetected failed backups. The hourly itself doesn't have this problem, because hourly is called immediately after a sync completes successfully; if the sync fails, the hourly doesn't rotate.
So I was just wondering if anyone had found a nice way to work around this.
Below is an example of what I mean. This assumes the sync is failing, so the second hourly (2 of 2, hourly.1) doesn't exist: it has already been moved to daily.0, and a new hourly.1 isn't created because of the failed sync.
If daily continues to run, it will eventually shift all the backups off the cliff and there will be no dailies left.
Hi Justec
We use R1Soft for backups; we had issues with high load when using the normal InterWorx backup runs (we do have some high-usage/high-volume sites), and switching immediately reduced the load to less than 2.
Have you looked at http://www.rfxn.com/appdocs/README.irsync? We have it installed on our test server, and it does appear to work very well.
I think this would overcome your issue of backups disappearing, and it gives you more options.
I hope that helps
Many thanks
John
The point-in-time backups, which are restorable as full backups, are stored in
the .SNAPS directory; these are rotated off for deletion based on the max age
value in conf.irsync, using find's mtime option piped to rm.
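That rotation can be sketched roughly as follows; the .SNAPS path and max-age value are placeholders standing in for the conf.irsync settings, and -exec is used here instead of a literal pipe to rm, which is safer with unusual filenames:

```shell
# Remove snapshot directories older than MAX_AGE days from the
# snapshots path (placeholder values, adjust to your conf.irsync).
SNAPS="/backups/.SNAPS"
MAX_AGE=7   # days

if [ -d "$SNAPS" ]; then
    find "$SNAPS" -mindepth 1 -maxdepth 1 -mtime +"$MAX_AGE" -exec rm -rf {} +
fi
```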
A common misconception is that deleting a hard link will delete the source data,
but this is not the case. When rm is run on hard-link pointers, the link count
is checked, and the data is only deleted when the count reaches 0.
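This is easy to verify on any Linux box:

```shell
# Demonstrate that removing one hard link does not delete the data:
# the contents survive until the link count drops to zero.
dir=$(mktemp -d)
echo "payload" > "$dir/full"        # the "full image" copy
ln "$dir/full" "$dir/snapshot"      # hard link, as the snapshots do
rm "$dir/full"                      # remove the original name...
cat "$dir/snapshot"                 # ...the data is still readable: payload
rm -r "$dir"
```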
To demonstrate how the backups work on the storage server we can look at the below
storage layout details to see how the snapshots and full image get populated.
The full image's synced data, with size and file count:
$ ls freedom.lan.full/
etc home local root var mysqldump mysqlhotcopy
$ du -sh freedom.lan.full/
1.9G freedom.lan.full
$ find freedom.lan.full | wc -l
17911
Now let's assume we have run three iterations of irsync to date; the snapshots
path would look something like this:
17819 2010-02-19.202026
18416 2010-02-20.202718
18227 2010-02-21.191503
So what does this all translate into? As we can see, our full backup is 1.9G in
size with 17.9k files; subsequent backups have synced in changed data only,
with the 2010-02-19.202026 image having 12M of changed data and an offset of 92
fewer files. Although we capture the changed data in the 02-19 snap, we also
have all our original data, as indicated by the file counts, but without the
space overhead of duplicating the data.
This is done by hard linking to the full image for any unchanged data. On
subsequent irsync runs, when new changed data is synced in, the hard links in
the snapshots are broken, leaving behind a copy of the original data in its
previous state. This method of point-in-time incremental backups allows for
easy retention of changed data with minimal space usage, while keeping a
logical backup layout that is fully restorable from each individual snapshot
and compatible with any utility, since hard links are treated just like regular
files and directories.
Yeah, this is a backup outside of the normal SiteWorx one, just backing up the raw files and databases.
I'll have to check those others out, but I really like rsnapshot. It uses rsync to make what are basically incremental backups, but thanks to the hard links every backup appears to be a full one for that specific date. It works very similarly to Time Machine on Apple.
If I stick with rsnapshot, I'm thinking of creating my own front-end script to call the daily, weekly, and monthly intervals, first checking that the source of the base rotation exists before starting it: hourly.1 to daily.0 for daily, daily.7 to weekly.0 for weekly, and weekly.4 to monthly.0 for monthly. So instead of the default behaviour of moving the dailies, for example, like this…
daily.7 --> Delete
daily.6 --> daily.7
daily.5 --> daily.6
daily.4 --> daily.5
daily.3 --> daily.4
daily.2 --> daily.3
daily.1 --> daily.2
daily.0 --> daily.1 (fails because daily.0 doesn't exist, so neither daily.0 nor daily.1 exists after this run)
The script would just do something like this:
if [ -d "snapshots/daily.0" ]; then
    rsnapshot daily
fi
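Expanding that idea into a full wrapper might look like this. This is just a sketch: the snapshot root, rsnapshot path, and the rotation sources come from my config and may well differ in yours.

```shell
#!/bin/bash
# Hypothetical guard wrapper: only rotate an interval when the
# directory it promotes from actually exists, so repeated failed
# syncs can't push the older backups off the end.
SNAPROOT="/backups/snapshots"                 # assumed snapshot_root
RSNAPSHOT="${RSNAPSHOT:-/usr/bin/rsnapshot}"  # assumed rsnapshot path

rotate_if_present() {
    local required="$1" interval="$2"
    if [ -d "$SNAPROOT/$required" ]; then
        "$RSNAPSHOT" "$interval"
    else
        echo "skipping $interval: $SNAPROOT/$required is missing" >&2
    fi
}

rotate_if_present hourly.1 daily    # daily.0 is promoted from hourly.1
rotate_if_present daily.7  weekly   # weekly.0 is promoted from daily.7
rotate_if_present weekly.4 monthly  # monthly.0 is promoted from weekly.4
```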
Not sure I know this phrase? Right now I just grab the raw MySQL files from /var/lib/mysql. Do you mean taking a mysqldump of some sort? Or something else entirely?
Yeah, good idea. I think I could dump them all into individual sql files in a specific folder and then backup that folder as part of the rsnapshot as well.
When just using the iworx command-line backup for databases only (which basically does a mysqldump anyway), how come this seems to still back up the entire SiteWorx account?
~iworx/bin/backup.pex --domains all \
--backup-options db \
--output-dir /var/db-backups/ \
--quiet \
--email admin@webhost.com
I'm guessing it's because it's doing a structure backup and a DB backup. Is there any way to do just a SiteWorx DB backup, or should I just do a direct dump? I just wanted to avoid having to put the MySQL root password in a script.
Also, it seemed to ignore the --output-dir directory and just put the backup in the root folder /.
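On the root-password point: one common way to keep it out of the script (a general MySQL technique, nothing iworx-specific) is to put the credentials in root's ~/.my.cnf, which mysql and mysqldump read automatically:

```
[client]
user=root
password=YOUR_PASSWORD_HERE
```

Then chmod 600 ~/.my.cnf so only root can read it, and the script needs no password at all.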
#! /bin/bash
# Dump every database on the system to its own gzipped file.
BACKUP_DIR="/var/lib/mysql_backups"
mkdir -p "$BACKUP_DIR"

# List databases, skipping the header row and the internal schemas.
databases=$(/usr/bin/mysql -e "SHOW DATABASES;" | grep -Ev "(Database|information_schema|performance_schema)")

for db in $databases; do
    /usr/bin/mysqldump --force --opt --databases "$db" | gzip > "$BACKUP_DIR/$db.gz"
done
So basically this creates an individual gzipped backup file for every database on the system and saves them to the backup directory. Those files then get picked up by my rsnapshot rotation when the var directory is backed up, so I have multiple snapshots of the databases.
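For completeness, restoring one database from a given snapshot would then be something like this; the snapshot path and database name are placeholders, not my real layout:

```shell
# Hypothetical restore of a single database from an rsnapshot interval.
SNAP="/backups/daily.0/localhost/var/lib/mysql_backups"   # assumed path

if [ -f "$SNAP/mydb.gz" ]; then
    # the dump was taken with --databases, so it recreates mydb itself
    gunzip < "$SNAP/mydb.gz" | mysql
fi
```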