% fortune -ae paul murphy

Rethinking backup

You know how to do data center backup, right? Everybody does: you use backup software to dump anything that changes to tape, store the tapes off-site, and Bob's your uncle. Duh.

Well, maybe not so fast: first, this off-site storage thing has risks of its own; second, it's time-consuming to recover files from backup; and third, volumes are starting to get absurd.

For example, Dell's PowerVault 136T holds up to 72 LTO-3 cartridges, each capable of holding 400GB of data. With one drive this thing costs $22,176, plus $5,999 for each additional drive (up to five more) and $2,390 for a full set of tapes at about $33 each.

Sounds big, but an organization with 1,000 PCs, each with an 80GB drive, could theoretically fill nearly three of these things with every backup: that's 80TB of disk against 28.8TB per fully loaded library at native capacity. In practice, of course, that never happens because most of the space on PC disks is either unused or committed to Microsoft and other software more easily re-installed as needed than backed up and recovered.
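As a rough check on that worst case, here is a minimal Python sketch of the capacity arithmetic. It uses only the figures quoted above and assumes native (uncompressed) tape capacity and decimal gigabytes; the function names are mine, purely for illustration.

```python
# Worst-case capacity check using the figures quoted in the text.
# Assumptions: native (uncompressed) LTO-3 capacity, decimal units.

TAPES_PER_LIBRARY = 72    # cartridges in one fully loaded library
TAPE_CAPACITY_GB = 400    # native capacity per LTO-3 cartridge
PC_COUNT = 1_000          # PCs in the example organization
PC_DISK_GB = 80           # disk size per PC


def library_capacity_gb(tapes=TAPES_PER_LIBRARY, tape_gb=TAPE_CAPACITY_GB):
    """Total native capacity of one fully loaded tape library, in GB."""
    return tapes * tape_gb


def libraries_needed(total_gb, capacity_gb):
    """How many libraries' worth of tape a backup of total_gb would occupy."""
    return total_gb / capacity_gb


if __name__ == "__main__":
    total = PC_COUNT * PC_DISK_GB
    capacity = library_capacity_gb()
    print(f"Total PC disk: {total:,} GB")
    print(f"One library:   {capacity:,} GB")
    print(f"Libraries:     {libraries_needed(total, capacity):.2f}")
```

Changing the constants to match your own shop (or doubling the tape capacity to model 2:1 compression) shows how quickly the worst case moves.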

Still, volumes are increasing, disentanglement is getting riskier, and costs are mounting: an organization with only a few dozen racks of mission-critical servers can easily find itself backing up half a terabyte a day to save what amounts to a few hundred megabytes of real data - and you can't send half a tape off-site, so the minimum ante is $33 a day.

There may be some better options. For example, an Apple dual G5 Xserve with a 5.6TB RAID array lists at $20,247. Replacement 400GB disks for it list at $675, and its DVD SuperDrive can write daily backup abstracts (i.e. just the real data) of up to 4.7GB (before compression and encryption) to DVD blanks costing less than a dollar.

Since the Xserve comes with gigabit Ethernet on board, getting the data to the machine is fast and easy, while finding and recovering a file from RAID storage is trivial - even if it is 93 days old.

The only issue that bothers me with this is that I have yet to find software that really automates the "just the facts, ma'am" step of sorting through the non-database stuff to figure out what has to be backed up to the DVD for long-term, off-site storage.
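To make that missing step concrete, here is a minimal Python sketch of one naive way to pick the day's "real data". It is not a product recommendation and certainly not the software I'm looking for: it simply assumes that "real data" means files modified since the last run whose extensions are not on a skip list, and it stops adding files once a 4.7GB DVD would overflow. The skip list, the function name, and the newest-first ordering are all hypothetical choices.

```python
import os
import time
from pathlib import Path

DVD_CAPACITY_BYTES = 4_700_000_000  # 4.7GB single-layer DVD, decimal bytes

# Hypothetical skip list: file types assumed to be re-installable, not real data.
SKIP_SUFFIXES = {".exe", ".dll", ".tmp", ".iso"}


def select_for_dvd(root, since_epoch, capacity=DVD_CAPACITY_BYTES,
                   skip_suffixes=SKIP_SUFFIXES):
    """Pick files under root modified at or after since_epoch.

    Returns (chosen_paths, total_bytes). Files are considered newest first,
    and any file that would push the total past capacity is left out.
    """
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if path.suffix.lower() in skip_suffixes:
                continue
            try:
                st = path.stat()
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if st.st_mtime >= since_epoch:
                candidates.append((st.st_mtime, st.st_size, path))

    candidates.sort(key=lambda c: c[0], reverse=True)  # newest first

    chosen, total = [], 0
    for _mtime, size, path in candidates:
        if total + size > capacity:
            continue  # does not fit on this disc
        chosen.append(path)
        total += size
    return chosen, total
```

Calling select_for_dvd("/some/share", time.time() - 86400) would list roughly the last day's changes under that (made-up) path. Deciding what really counts as data, rather than guessing from extensions and timestamps, is exactly the part that still needs proper software.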

On the positive side, the Mac has dual 2.3GHz G5s and 4GB of RAM, suggesting that when I find the right software, the system won't have any trouble running it in the 22 hours or so available for this step if daily DVDs are required.

The cash savings are obvious, but other things may be more important. For example, high-quality DVDs outlast tapes, cost less, and require less storage space.
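For anyone who wants the media savings spelled out, here is a back-of-the-envelope Python sketch. It assumes exactly one tape or one DVD goes off-site per day, 365 days a year, and uses the $33 tape price and the "less than a dollar" DVD price quoted earlier (rounded up to $1.00 as an upper bound). Hardware, drives, and courier costs are deliberately ignored.

```python
# Annual off-site media cost, assuming one medium shipped per day.

TAPE_COST_PER_DAY = 33.00  # one LTO-3 cartridge, price quoted in the text
DVD_COST_PER_DAY = 1.00    # upper bound: blanks cost "less than a dollar"


def annual_media_cost(cost_per_day, days=365):
    """Media spend per year if one medium goes off-site every day."""
    return cost_per_day * days


if __name__ == "__main__":
    tape = annual_media_cost(TAPE_COST_PER_DAY)
    dvd = annual_media_cost(DVD_COST_PER_DAY)
    print(f"Tape: ${tape:,.2f} per year")
    print(f"DVD:  ${dvd:,.2f} per year (upper bound)")
    print(f"Difference: ${tape - dvd:,.2f} per year")
```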

More interestingly, however, the use of low-cost RAID storage can make the whole backup and recovery thing easier and more effective. Consider, for example, that Sybase ASE on the Mac loads files produced by Backup Server on Solaris - meaning that you might as well have the Solaris Backup Server create those files directly on the Mac for safe RAID storage and, once a week, have the Mac actually load them into the replicant database so everything can be dumped to text, compressed, encrypted, and sent off-site on a $0.92 DVD.

Paul Murphy wrote and published The Unix Guide to Defenestration. Murphy is a 25-year veteran of the I.T. consulting industry, specializing in Unix and Unix-related management issues.