Snapshots And Backups Part Deux


Howard Marks

February 17, 2011


In What Is A Backup?, I compared conventional backups to local snapshots, concluding that restoring data with conventional backups is faster when you know where the data was last. With conventional backups, an administrator, after cursing under his breath and wishing he could just say no to the CFO, could search the catalog database for *smith*.xls in Finance and locate the file. Since local snapshots don't include catalogs, it's harder to restore the data that disappeared sometime last summer. But there are more issues around using local snapshots for backup.

Part of the reason many storage administrators cringe at the thought of snapshots as a backup medium is that they still view the boxes of old backup tapes at Iron Mountain as a long-term data retention solution. The limited number of snapshots a storage system can maintain means snapshots can't satisfy the long-term retention function that many backup admins continue to use their backup systems for.

Let's look at solutions from a couple of the vendors that emphasize snapshots and that responded to Hollis' When Is A Backup Really A Backup? NetApp systems can keep 255 snapshots of any given volume. Since NetApp stores snapshot data in the same RAID set as the primary data (which means on the same class of disks), keeping 255 snapshots will be expensive. Nimble Storage pitches its system as consolidating backup and primary data, holding 30 to 60 days of backup data compressed on SATA drives.

Frankly, if you have a real archiving system, 60 days of backup data should be plenty. The truth is, you rarely restore data older than 60 days. You may go on a fishing expedition looking for data someone needs now that he or she deleted a year ago, but archives, with their full-text indexes and deep metadata catalogs, are much better places to fish for data than dusty old backup tapes.

My real problem with using local snapshots as backups is the local part. Snapshots stored in the same system as the primary data are dependent on the primary data. If the storage system fails, you lose not just the primary data but the backups as well.

I was going to write about how I thought the comments by EMC's Chuck Hollis that array failures were rare occurrences that users could essentially ignore were foolish at best and irresponsible at worst. I was going to look up all sorts of statistics about the likelihood of dual drive failures in RAID 5 systems and really geek it up.

Then I got an e-mail from a client that last Thursday suffered a dual disk failure on their primary disk array. I'm spending the next few days helping them with the aftermath. Once you have to clean up after something like that, you don't worry about statistics anymore. It happens, it's happened to me, and it's going to happen to you. So local snapshots are not enough.
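For readers who still want a bit of the math: here's a back-of-envelope estimate of the chance that a second drive fails before a RAID 5 rebuild completes. Every figure below (MTBF, set size, rebuild time) is an illustrative assumption, not a measurement, and the model ignores unrecoverable read errors during rebuild, which in practice make the real risk worse.

```python
# Back-of-envelope RAID 5 dual-failure estimate.
# Every number here is an illustrative assumption, not vendor data.
mtbf_hours    = 1_000_000   # assumed per-drive MTBF
drives        = 8           # drives in the RAID 5 set
rebuild_hours = 24          # assumed rebuild time onto a spare

# After the first failure, the array survives only if none of the
# remaining drives fails before the rebuild finishes.
hourly_rate = 1 / mtbf_hours
survivors   = drives - 1
p_second_failure = 1 - (1 - hourly_rate) ** (survivors * rebuild_hours)

print(f"P(second failure during rebuild) ~ {p_second_failure:.2e}")
```

With these assumptions the estimate comes out around 1.7e-4 per rebuild. Small, but as the story above shows, small is not zero when multiplied across many arrays and many years.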

To make snapshots a sufficient backup system, you need to replicate the data as well as take snapshots. When combining replication and snapshots, I see three places where things could go wrong. First, you have to replicate to an independent system in the same data center, or at least on campus, so you can recover quickly from an array failure without activating your whole disaster recovery plan. Then you have to replicate to a remote site so you're covered in case of bigger problems like fires, floods and power failures.
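The three-copy plan above can be written down as a simple coverage table. This is only a sketch; the scenario and copy names are my own labels, not tied to any particular vendor's product.

```python
# Three-copy plan: each failure scenario should map to a copy that
# survives it. Labels are illustrative, not tied to any product.
copies = {"local snapshots", "same-campus replica", "remote replica"}

recovery_copy = {
    "accidental deletion": "local snapshots",      # fastest restore
    "array failure":       "same-campus replica",  # independent hardware,
                                                   # no full DR failover
    "site disaster":       "remote replica",       # fire, flood, power loss
}

# Sanity check: every scenario is covered by a copy we actually keep.
assert set(recovery_copy.values()) == copies

for scenario, copy in sorted(recovery_copy.items()):
    print(f"{scenario:>20} -> recover from {copy}")
```

The point of the table is the sanity check: if a scenario maps to a copy you don't actually maintain, your "backup" has a hole in it.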

Finally, you have to make sure all three sets of snapshots are application consistent. It's easy to have Windows Volume Shadow Copy Service or scripts quiesce your database for the local snapshot, but you have to take care that the replication system in your storage arrays maintains that snapshot timing. Often the easiest way is to use point-in-time replication that sends the snapshot data from array to array rather than replicating in real time and creating snapshots on the target arrays.
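The quiesce-then-snapshot sequence is easy to get wrong in scripts, because the database must be released even when the snapshot step fails. A minimal sketch, assuming a MySQL-style database and a hypothetical array CLI called `snapcli` (a stand-in for whatever your vendor actually ships):

```python
import subprocess

def consistent_snapshot(volume, run=subprocess.run):
    """Quiesce the database, snapshot the volume, then resume writes.

    "snapcli" is a hypothetical stand-in for your array vendor's CLI;
    the MySQL statements are one common way to quiesce -- substitute
    your database's equivalent, or use VSS on Windows.
    """
    run(["mysql", "-e", "FLUSH TABLES WITH READ LOCK"])  # quiesce writes
    try:
        run(["snapcli", "create", "--volume", volume])   # hypothetical CLI
    finally:
        run(["mysql", "-e", "UNLOCK TABLES"])            # always resume
```

The try/finally guarantees the unlock runs even if the snapshot command fails, so a broken snapshot never leaves the database frozen; injecting `run` also makes the sequencing testable without real commands.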

Once you get to three copies, snapshots can be a reasonable backup plan. However, with three copies, snapshots can cost as much as more conventional backups.


About the Author(s)

Howard Marks

Network Computing Blogger

Howard Marks is founder and chief scientist at DeepStorage LLC, a storage consultancy and independent test lab based in Santa Fe, N.M., concentrating on storage and data center networking. In more than 25 years of consulting, Marks has designed and implemented storage systems, networks, management systems and Internet strategies at organizations including American Express, J.P. Morgan, Borden Foods, U.S. Tobacco, BBDO Worldwide, Foxwoods Resort Casino and the State University of New York at Purchase. The testing at DeepStorage Labs is informed by that real-world experience.

He has been a frequent contributor to Network Computing and InformationWeek since 1999 and a speaker at industry conferences including Comnet, PC Expo, Interop and Microsoft's TechEd since 1990. He is the author of Networking Windows and co-author of Windows NT Unleashed (Sams).

He is co-host, with Ray Lucchesi, of the monthly Greybeards on Storage podcast, where the voices of experience discuss the latest issues in the storage world with industry leaders. You can find the podcast at: http://www.deepstorage.net/NEW/GBoS
