So, What Is A Backup?

Last week EMC blogger Chuck Hollis started a bit of a firestorm when he questioned the marketing position some vendors are taking that local snapshots make traditional backups unnecessary. As you might expect, an assortment of EMC haters, employees of other vendors and industry analysts jumped in to add fuel to the fire. I think Hollis did ask a valid question: As technologies advance, what is a backup?

Howard Marks

February 15, 2011

Network Computing

The only reason we back up in the first place is to be able to restore data to the condition it was in before some unfortunate event. To a first approximation, lots of things serve as backups. In fact, the most common sources of saved data used to recover from an unfortunate event are the copies of documents that Microsoft Office automatically saves periodically. I end up reverting to an auto-saved copy of a document about once a month, when my poor PC collapses under the strain of 700 open apps and browser tabs and I have to reboot it.

So if any copy can be a backup, the real question is: What is a sufficient backup, and where do snapshots fit in my backup plan? To be sufficient, my backup architecture has to be able to satisfy restore and recovery requests in a reasonable period of time with as little work by me as possible. It also has to have as little impact on users and application performance as possible. As long as I can satisfy my users' restore requests, I am satisfied to call it a sufficient backup. From where I sit, backups don't have to be created by backup software, and they don't have to be in some special backup format.

User and application impact is where many copy-on-write snapshot systems break down. With some snapshot systems, such as VMware's, keeping multiple snapshots on disk for several days can slow system performance significantly, and can even crash every virtual machine using a data store if that data store fills up with snapshot data. So for me to consider snapshots as a backup, they'd better be good snapshots, and that usually means redirect-on-write rather than copy-on-write snapshots.
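To make the copy-on-write versus redirect-on-write distinction concrete, here is a toy model in Python. It is a sketch under simplifying assumptions, not any vendor's implementation: the class names, the one-I/O-per-block accounting and the in-memory "disks" are all invented for illustration. It assumes that a copy-on-write volume must read the old block and copy it aside before the first overwrite after a snapshot, while a redirect-on-write volume simply writes the new data to a fresh location and leaves the old block where the snapshot can still find it.

```python
from dataclasses import dataclass, field


@dataclass
class CowVolume:
    """Toy copy-on-write volume (hypothetical): old data is copied aside
    before a block is overwritten in place."""
    blocks: dict = field(default_factory=dict)    # live data: block number -> bytes
    snapshot: dict = field(default_factory=dict)  # preserved pre-snapshot blocks
    snap_active: bool = False
    io_count: int = 0

    def take_snapshot(self):
        self.snapshot = {}
        self.snap_active = True

    def write(self, block, data):
        # First overwrite after a snapshot costs one extra read and one extra write.
        if self.snap_active and block in self.blocks and block not in self.snapshot:
            self.snapshot[block] = self.blocks[block]
            self.io_count += 2
        self.blocks[block] = data
        self.io_count += 1


@dataclass
class RowVolume:
    """Toy redirect-on-write volume (hypothetical): new data lands in a new
    location and the snapshot keeps the old block map."""
    store: list = field(default_factory=list)     # append-only physical blocks
    live_map: dict = field(default_factory=dict)  # block number -> index in store
    snap_map: dict = field(default_factory=dict)  # block map frozen at snapshot time
    io_count: int = 0

    def take_snapshot(self):
        self.snap_map = dict(self.live_map)       # metadata copy only, no data I/O

    def write(self, block, data):
        self.store.append(data)                   # one write, old block untouched
        self.live_map[block] = len(self.store) - 1
        self.io_count += 1

    def read_snapshot(self, block):
        return self.store[self.snap_map[block]]


def overwrite_cost(volume, n_blocks=100):
    """Fill n_blocks, snapshot, overwrite every block, return the I/Os the overwrite took."""
    for b in range(n_blocks):
        volume.write(b, b"old")
    volume.take_snapshot()
    before = volume.io_count
    for b in range(n_blocks):
        volume.write(b, b"new")
    return volume.io_count - before


if __name__ == "__main__":
    print("copy-on-write I/Os:    ", overwrite_cost(CowVolume()))
    print("redirect-on-write I/Os:", overwrite_cost(RowVolume()))
```

In this model the copy-on-write volume does three I/Os for every first overwrite where the redirect-on-write volume does one. Real arrays and hypervisors add caching, metadata updates and space reclamation that the sketch ignores, so treat the ratio as an illustration of the mechanism rather than a measurement.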

The majority of restore requests are for single files, or groups of related files, that users have accidentally deleted or corrupted. Since the users are the cause of this data destruction, one of my key criteria for backups is that the backup copy has to be outside the control of ordinary users and their applications. After all, the user who decided that he or she didn't want a file might make sure to delete all the copies of that file he or she can find. Local snapshots meet the separation criterion and are a great way to get users their original copies back fast. Snapshot systems that integrate with Windows Shadow Copies of Shared Folders, so users can restore their own files through the Explorer Previous Versions tab, or that have a Web interface for self-service restores, are even better.

Snapshots are great for single-file restores when you know what you're looking for. When a user calls and asks for BigDeal4453.pptx to be restored from his or her home directory's SALES folder because that file, and not the intended DeadDeal4453.pptx, was deleted, restoring is easy with a snapshot. Dealing with the call that says, "I had an Excel spreadsheet called 'smithsomething' or 'somethingsmith' somewhere in the Finance folder structure that we used in June or July and I need it back now," is a bit harder.

When it comes to satisfying users' restore requests, snapshots make the most frequent requests easier. However, because each snapshot is a separate point-in-time view of your data store, if you don't know exactly what you're looking for, or when it last existed, you can spend a lot of time mounting snapshots and scanning them to find the data your users so desperately need.
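The mount-and-scan chore can at least be scripted. The Python sketch below assumes the snapshots are already mounted as ordinary directories; the mount paths, the function name find_in_snapshots and the wildcard pattern are all hypothetical, chosen to match the "smith something in Finance" call. It walks one subfolder in each snapshot and reports every file whose name matches, ignoring case.

```python
from fnmatch import fnmatchcase
from pathlib import Path


def find_in_snapshots(snapshot_roots, subfolder, patterns):
    """Yield (snapshot_name, path_relative_to_snapshot) for every file under
    `subfolder` whose name matches any wildcard pattern, ignoring case."""
    lowered = [p.lower() for p in patterns]
    for root in snapshot_roots:
        root = Path(root)
        base = root / subfolder
        if not base.is_dir():
            continue  # the folder did not exist in this snapshot
        for path in sorted(base.rglob("*")):
            name = path.name.lower()
            if path.is_file() and any(fnmatchcase(name, p) for p in lowered):
                yield root.name, path.relative_to(root)


if __name__ == "__main__":
    # Hypothetical mount points, one directory per point-in-time snapshot.
    mounts = ["/mnt/snapshots/june-15", "/mnt/snapshots/july-15"]
    for snap, rel in find_in_snapshots(mounts, "Finance", ["*smith*.xls*"]):
        print(snap, rel)
```

This only automates the walk. Every snapshot still has to be mounted and read in full, so the time cost grows with the number of snapshots you keep.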

About the Author(s)

Howard Marks

Network Computing Blogger

Howard Marks is founder and chief scientist at DeepStorage LLC, a storage consultancy and independent test lab based in Santa Fe, N.M. and concentrating on storage and data center networking. In more than 25 years of consulting, Marks has designed and implemented storage systems, networks, management systems and Internet strategies at organizations including American Express, J.P. Morgan, Borden Foods, U.S. Tobacco, BBDO Worldwide, Foxwoods Resort Casino and the State University of New York at Purchase. The testing at DeepStorage Labs is informed by that real-world experience.

He has been a frequent contributor to Network Computing and InformationWeek since 1999 and a speaker at industry conferences including Comnet, PC Expo, Interop and Microsoft's TechEd since 1990. He is the author of Networking Windows and co-author of Windows NT Unleashed (Sams).

He is co-host, with Ray Lucchesi, of the monthly Greybeards on Storage podcast, where the voices of experience discuss the latest issues in the storage world with industry leaders. You can find the podcast at: http://www.deepstorage.net/NEW/GBoS
