Why is backup such a pain? Mainly it’s because we have to interrupt our workflow in order to do it, but for a good number of people it’s about remembering to back up, or being connected to the medium to back up to. Even if you remember to back up, set it up, and set it going, your computer is unusable during that timeframe (both because of the system activity and because you don’t want to change files, if possible).
It’s time to take the pain out of backups. I’ll discuss some solutions for both of these problems and I’m sure some of them will fit your workflow. First, however, I’ll talk about the types of backups — especially the kinds you don’t need as a consumer.
Backup Strategies
A backup is a copy. It has nothing to do with being bootable or looking exactly like your files do today, though that’s helpful. In the end, a backup is a copy of information you consider important. A lot of people enter into the world of backups thinking that they need a tool that will make a perfect clone of their hard drive somewhere that they can image back from and get going with minimal fuss.
Completely wrong.
A backup is a copy of data you do not otherwise have a copy of. Why copy programs you can re-download or reinstall? Why back up your OS install? Preferences, settings, configurations and documents, and other generated files are all fair game for backup, but if you’re backing up static files you have another way of getting, you’re just wasting time and space.
A proper backup means copying the fewest files needed to get you completely back up and running again. A copy of your home directory is perfectly acceptable, by that definition. A disk image of your computer would be an utter waste of time and space.
Which brings up another topic: methods. There are two primary methods, and some variations on those. The first is a complete backup, which simply translates into “copy every file in the list”. The second is an incremental backup, which copies only files that have changed since the last backup, whether that was a complete or an incremental one. Some other concepts based on these:
- Versioned backups: Several complete backups spanning a long period of time, sometimes years. Financial institutions sometimes use a form of this by performing a year-end backup and moving it offsite to record the state of the system at that time.
- Differential backup: One complete backup and one cumulative incremental backup that holds everything changed since that complete backup, so it grows over time until another complete backup is performed.
- Mirror backup: One complete backup that is maintained by copying another complete backup over it.
- Progressive mirror backup: A complete backup that is maintained with an incremental backup over it.
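To make the difference between the two primary methods concrete, here is a minimal shell sketch. The function names, the stamp file, and the use of plain cp and find are my own assumptions for illustration, not a recommendation of a particular tool, and the sketch makes no attempt to deal with Mac-specific metadata.

```shell
# Hypothetical helpers; the names and the stamp-file convention are illustrative only.

# Complete backup: copy every file under SRC into DEST, then record the time in STAMP.
complete_backup() {
    src=$1
    dest=$2
    stamp=$3
    mkdir -p "$dest"
    cp -Rp "$src"/. "$dest"/
    touch "$stamp"
}

# Incremental selection: list only the files under SRC modified after STAMP.
changed_since() {
    src=$1
    stamp=$2
    find "$src" -type f -newer "$stamp"
}
```

Run complete_backup once with a source, a destination, and a stamp path of your choosing, then call changed_since with the same source and stamp to list the candidates for the next pass. Refreshing the stamp after every pass keeps each increment small; leaving it at the last complete backup produces the ever-growing set that the differential entry describes.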
RAID is Not a Backup
RAID is a method of combining many disks into one logical one. There are different approaches to doing this, and all but one of the common methods result in redundancy. Because RAID is advertised as “your data will not be destroyed, even if you lose a disk!” there are a lot of people out there who want to believe a simple RAID 1 (mirror) or even a RAID 5 (block striping + parity) is a good defense against the ills of data loss. Oh, I wish it were true.
RAID, statistically, increases your chance of seeing a hardware failure. Yes, while that seems odd, think about it a little bit. Let’s say one drive has an MTBF of 100,000 hours (11.4 years). That means that, across a sample group of drives, you should expect about one failure for every 100,000 hours of combined running time. No, it doesn’t mean one drive lasts 11 years. If you increase the size of the group, you increase the chance that any one drive in that group will fail. If you have one drive, we’ll call that a 1/1000 chance of failure (totally arbitrary). Add in nine more drives with the same chance of failure and your odds worsen to roughly a 1/100 chance of a single-disk failure for the same time frame (again, arbitrary numbers).
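If you want to check the arithmetic, the chance that at least one of n drives fails is 1 - (1 - p)^n, assuming every drive fails independently with the same probability p over the period. That independence is my simplifying assumption, and the function name is made up. A quick shell sketch using awk:

```shell
# any_drive_fails P N
# Probability that at least one of N drives fails, assuming each drive fails
# independently with probability P over the same time frame (an assumption).
any_drive_fails() {
    awk -v p="$1" -v n="$2" 'BEGIN { printf "%.5f\n", 1 - (1 - p) ^ n }'
}
```

With the arbitrary numbers above, any_drive_fails 0.001 1 prints 0.00100 and any_drive_fails 0.001 10 prints 0.00996, which is as near to 1/100 as makes no difference.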
RAID is a technology that keeps you running in the case of hardware failure, but it does nothing to prevent the failure of the hardware, and provides zero protection from logical failure.
Physical vs. Logical vs. Human Failure
When a drive goes south and you hear the anus-clenching “chunka-chunka-chunka” sound coming from the drive, that’s a physical failure. When your power goes out during a save and, suddenly, you can’t see files you’ve made in the past year or so, that’s a logical failure. When your child drags your Documents folder to the trash to see Oscar wave at him (raise your hand if you remember that), that’s a human failure. A RAID protects against physical. A backup protects against them all.
Versioned Backups: For People With Money To Burn
Businesses love their versioned backups. If I had enough money to burn to do it right (around $100K) I would certainly do it. It’s a great idea and can be really handy, the ability to go back a week or two for a file. However, if you have 30 GB of data with 1 GB changing a day (not added, just changing), you’re looking at 36 GB of data a week to maintain one complete and six incremental backups. Make that 250 GB and 5 GB changing a day and you’re in the land of RAID or tape for storage of anything more than a couple of weeks.
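The storage math is simple enough to sketch in shell. I’m assuming one complete backup plus six daily incrementals per week, with every retained week starting from a fresh complete backup; the function name and the weeks parameter are mine.

```shell
# versioned_storage_gb FULL_GB DAILY_CHANGE_GB WEEKS
# One complete backup plus six incrementals per week, kept for WEEKS weeks.
versioned_storage_gb() {
    full=$1
    daily=$2
    weeks=$3
    echo $(( weeks * (full + 6 * daily) ))
}
```

Under those assumptions, versioned_storage_gb 30 1 1 gives the 36 GB above, while versioned_storage_gb 250 5 2 gives 560 GB for just two weeks of history.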
If you love your data, have the money to do it, and you want to keep copies, this is a great strategy and I recommend it. If you’re a common peasant like me, however, get an external hard drive around 300 GB and keep reading.
Mirror, Mirror
The easiest backup to make with Mac OS X is a mirrored backup. Mac OS X even includes a program for doing this, though it is disguised. In Disk Utility you can “restore” a drive to a saved image. However, what it doesn’t tell you is that neither source nor destination is restricted to being an image, just a volume. Attach a drive and image your boot drive to the external one. Congratulations, you’ve learned how to waste time and space as well as make a good backup. Let’s refine that strategy a little bit.
Mirrors are still useful, but you want to use them properly if you’re going to use them at all. I absolutely love the tool psync for performing mirrored backups. It’s a Perl package that includes a shell utility that performs a synchronization between two file paths. The first sync is more like a complete backup whereas the following ones are progressive mirrors. It’s fast, free, and does the job well. The following command will install it:
$ perl -MCPAN -e 'install MacOSX::File'
You’ll need the developer tools installed for that one. Also, if you’ve never used CPAN before you’re going to get a lot of questions to answer. Just think logically and accept the defaults and you’ll be okay. Pick random mirrors if you have no preferences.
Backup Tools
If you want to pay money for a good Cocoa app to do this for you, there are some gems I’ve found that, if I weren’t a psync cult follower, I’d easily shell the money out for. Instead of buying them, I’ll tell you to buy them.
Carbon Copy Cloner
I have mixed feelings about CCC. It’s a great cloner. For making bootable clones of a hard drive, it’s bar none the winner in the group, to say nothing about NetBoot images or similar. It’s great. It is, however, a cloner and not a backup utility. A whole lot of people use it for backup when that’s just not what it’s good at. It’s good for clones. I realize it has an interface to psync built-in (and lightly hidden) and that’s a plus, it is, but that’s an afterthought and it’s obvious.
If you want to have a bootable mirror, like a RAID, but with the logical and human protections a backup affords over RAID, this is your tool. It will even automate the task with cron for you. If you want anything else, keep reading.
SuperDuper!
SuperDuper is a handy little program for making clones as well. It is, however, a little more in that it allows you to make specific sets of files to include (and has a preset for just the users’ homes) and lets you add in wildcard adds and ignores. It’s pretty decent for what it does and the GUI is clean and usable. For folks wanting “Mac software” this is your choice. If you want power, however…
SyncupX
SyncupX lets you do some wild things with your backup, including using a Finder Saved Search as an include or exclude rule. It’s powerful, easy, quick, and cheap and if you’re looking for a one-click backup, it’s for you. It has an “autosync” mode wherein it will start a sync on open so that you can set iCal events to open the program and it will do the backup then. A bit hackish in my opinion, but it works fine. If you like neat ideas, this is certainly one to look at. Using Saved Searches as include and exclude rules is just brilliant, I think.
Personal Preference
In the end, it comes down to how you want to back up. I’ll go over what I’ve done to give you an idea of a complete solution, however. Keep in mind that a backup is just a copy, so however you go about it (that is, using a tool or not) works as long as you have a copy of what matters.
So, my solution starts with me buying a 10-client copy of Mac OS X Server 10.4 ($500) and setting up mobile homes for my iBooks. This way the computers write to their local hard drive for all my user data, and when I get home and within AirPort range of my server, it will eventually pick up that fact and start a background sync with my home on the server. No thought required and very little disturbance. Workgroup Manager lets me pick which files to include or exclude in the sync and I’m done. If I install software elsewhere on my drive I make sure the config files are kept in my home, somehow. All programs I’ve installed are in my home as well; by my own rule that’s a waste, but it’s one I’ll live with.
On the server I use psync with cron to mirror the data drive and the homes partition of the boot drive to an external 300 GB drive that’s used only for backups. Yes, I have three copies of my home directory, all aged by about a day. It’s damned handy, and I don’t have to think about it. Once a week I make sure it’s happening by checking dates and, reassured that it is, go off and do something stupid that requires restoring from said backup…
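For the curious, a cron-driven mirror could look something like the crontab sketch below. Everything in it is an assumption for illustration: the schedule, the install path of psync, and the volume names are made up, and I’ve left out psync’s options entirely, so check its documentation before trusting it with real data.

```
# Hypothetical crontab entry: run a psync mirror every night at 3:00.
# min hour day-of-month month day-of-week  command
0 3 * * * /usr/local/bin/psync /Volumes/Data /Volumes/Backup
```

A nightly schedule matches the “aged by about a day” copies described above; pick whatever hour your server is least busy.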
Well, if the article needs a verdict, I’ll say that Mac geeks should use psync for most everything they need. Other Mac users should be happy with one of the above tools for backup as well, but if you can afford a small home server (and know how to run one), I’d use mobile homes any day. It’s just cool. 
CodePoet, you’ve done it again.
Excellent article and to the point. The first part of your title (the lazy part) is what got my attention in the first place, though.
Your recommendation about using Tiger Server is hands down the ultimate solution but for those that don’t have Tiger Server, psync will definitely do the trick.
One thing though, if I may. I’m a bit of a stickler when it comes to security in the sense that I would like my backups to be on an encrypted image (I use FileVault all the time). Currently I back up manually simply because I haven’t really found a reliable way of having it automated.
For example, I’d like to back up the following from my home directory:
Documents
Library/Mail
Library/Safari
Library/Preferences
Since I back up to an encrypted image on an external drive, I have to manually open that image (it needs a password), and then simply drag these files into it.
Am I stuck doing this manually forever, or is there a better way for me?
Thanks again.
Jarod
Something like this (untested) should work for you.
Thanks; I’ll try it out.
In the meantime, I was doing some research and came across PsyncX; it’s a graphical interface to psync for those that don’t like (or are uncomfortable with) the command line interface in Terminal.
http://sourceforge.net/project/showfiles.php?group_id=59994
- Jarod
@Jarod: You could update changed files to a sparse image using a custom Copy Script and Smart Update with SuperDuper!. I’m not sure what the interaction would be with an encrypted image since I’ve never tried that.
Btw, if you’re using keychains you might want to add ~/Library/Keychains to your list of files to back up. Personally, I’d prefer backing up all of ~/Library and possibly omitting specific files and folders (e.g. Cache, Safari/Icons).
. . .
Overall, I trust the data integrity of my OS X backups done using SuperDuper! to be more accurate, thorough, and reliable than with either psync or rsync. Plus you get prompt and personal support for SD! backup solutions if you need it. One limitation for some backup scenarios is that it needs a GUI to run. Otherwise, it’s a utility I can highly recommend, even to certain Mac geeks (like me).
psync is no longer needed. Apple finally added the -E or --extended-attributes flag, which makes rsync HFS-aware, and rsync comes with OS X, so no need to download more cruft.
Yes and no. If you’re backing up a subdirectory then rsync works. If you back up a large number of items then you might run into rsync sucking and have it crash on you. I’ve not found an exact number that causes it, but you’ll know when it happens.
Also, ACLs do not have change dates, so files with ACLs (even ones that “explicitly” say inherit) will always be copied over. Most users are not using ACLs (nor have them enabled) but for those that do that’s an additional wait and waste of time if there are a large number of files with them on.
Does psync even handle ACLs at all? One might prefer unnecessary copying of files with ACLs to having ACLs lost in transmission.
It does not. If you use ACLs, use rsync. If not, psync.
whereas with rsync:
I’m surprised I don’t read about Apple Backup. Is it too bad to even be considered? Or did you exclude it because you need a .Mac subscription for it?
Otherwise, great article
You could use LBackup, a wrapper for rsync that supports pre and post scripts. These scripts can handle, among other things, the mounting and unmounting of encrypted disk images. LBackup can also send you an email notifying you of any issues during the backup, or just a copy of the log whenever you specify that it should be sent.