Can Cloud Snapshots Replace Backup?


Howard Marks

February 24, 2011


I'm satisfied that snapshots and replication in conventional storage systems can serve the same function as more traditional backup schemes. While snapshots make satisfying the most common restore requests easy, the limitations of the snapshot mechanism in most storage systems leave most organizations using snapshots as a supplement to, not a replacement for, backup copies. Does the cloud change the snapshot-as-backup calculus? Some cloud storage vendors say it does.

The primary function of a cloud storage gateway, like those from vendors such as Nasuni, Cirtas and StorSimple, is to let users take advantage of cloud storage without rewriting their applications. Without a gateway, your applications have to put and get data objects from the cloud storage provider you've chosen through that vendor's particular API. Your users want to store their data on a NAS or file server via CIFS or NFS, and server applications like Exchange and SharePoint need traditional block interfaces. The cloud storage gateway maps these common protocols onto the cloud object store and provides a local cache to make your applications run faster.
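To make the gateway idea concrete, here is a minimal sketch of a file-style front end sitting on a put/get object store with a small local cache. It is an illustration only, not any vendor's design: `ObjectStore`, `CachingGateway`, `write_file`, `read_file` and the tiny LRU cache are all hypothetical names and choices, and a real gateway would speak CIFS, NFS or a block protocol rather than Python method calls.

```python
from collections import OrderedDict


class ObjectStore:
    """Stand-in for a cloud provider's put/get object API (hypothetical)."""

    def __init__(self):
        self._objects = {}
        self.gets = 0  # counts round trips to the "cloud"

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        self.gets += 1
        return self._objects[key]


class CachingGateway:
    """Offers file-style calls and keeps a small LRU cache (assumed policy)."""

    def __init__(self, store, cache_entries=2):
        self.store = store
        self.cache = OrderedDict()
        self.cache_entries = cache_entries

    def _remember(self, path, data):
        self.cache[path] = data
        self.cache.move_to_end(path)
        while len(self.cache) > self.cache_entries:
            self.cache.popitem(last=False)  # evict the least recently used entry

    def write_file(self, path, data):
        self.store.put(path, data)  # write through to the object store
        self._remember(path, data)

    def read_file(self, path):
        if path in self.cache:
            self.cache.move_to_end(path)  # cache hit: no trip to the cloud
            return self.cache[path]
        data = self.store.get(path)  # cache miss: fetch from the cloud
        self._remember(path, data)
        return data
```

Reads of recently used files never leave the building; only a miss costs a round trip to the provider, which is the whole point of keeping the working set in cache.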

The cool part is that the gateways also provide snapshots. Since cloud storage providers will be glad to sell you as much space as you want, the gateway vendors have designed their systems to let you have an unlimited number of snapshots of your volume or file system.

That's a big step up from the 16 to 255 snapshots most disk systems let you keep online. And since the snapshots exist out in the cloud, while your gateway has a couple of TB of cache for the working set of data that you and your applications actually access on a day-to-day basis, those snapshots won't have any impact on performance.
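One way to see why the snapshot count can be unbounded is to treat each snapshot as nothing more than a frozen list of pointers to objects that never get overwritten. The sketch below assumes that design; the gateway vendors' actual mechanisms aren't described here, and `SnapshottingVolume` and its methods are hypothetical names.

```python
import hashlib


class SnapshottingVolume:
    """Toy volume: data lives in immutable chunks, snapshots are manifests."""

    def __init__(self):
        self.chunks = {}     # content hash -> bytes (the "cloud" objects)
        self.live = {}       # path -> content hash (the current volume)
        self.snapshots = []  # frozen manifests; the list can grow without limit

    def write(self, path, data):
        digest = hashlib.sha256(data).hexdigest()
        self.chunks.setdefault(digest, data)  # old chunks are never overwritten
        self.live[path] = digest

    def snapshot(self):
        self.snapshots.append(dict(self.live))  # copy pointers, not data
        return len(self.snapshots) - 1

    def read(self, path, snapshot_id=None):
        manifest = self.live if snapshot_id is None else self.snapshots[snapshot_id]
        return self.chunks[manifest[path]]
```

Taking a snapshot copies only the manifest, so reads of the live volume follow the same path whether there are two snapshots or two thousand. The price is paid in stored chunks, not in performance.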

A redundant pair of caching gateways is reliable enough that I would consider them, and the snapshot data they hold, to satisfy my need for a local copy. Since all your data is in the cloud, data is "backed up" in close to real time, and if you need to recover at a remote location you just need to fire up a gateway at the remote site. The new gateway will start populating its cache as your users access their data. Despite the fact that it's restoring across an Internet link, your users get to their most critical data faster than if you restored a whole server from a conventional backup, because the gateway restores data in small chunks as needed.

Now don't get me wrong, cloud snapshots aren't perfect. If you decide to keep 5,000 snapshots, you'll have to pay Amazon or Nirvanix every month to keep all the data in those snapshots online. Like other snapshots, snaps in the cloud don't come with extensive metadata, so a keyword search might be a slow and painful experience as the whole data set has to get dragged down from the cloud.
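A back-of-the-envelope calculation shows how that monthly bill grows. Every number below is a made-up assumption for illustration, not a quoted price from Amazon or Nirvanix, and the model assumes each snapshot only adds the data that changed since the previous one.

```python
def monthly_snapshot_cost(base_gb, changed_gb_per_snapshot, snapshots, price_per_gb_month):
    """Estimate the monthly storage bill for a volume plus its snapshots.

    Assumes each snapshot retains only the data that changed since the
    previous one, and that the provider bills a flat rate per GB per month.
    """
    stored_gb = base_gb + changed_gb_per_snapshot * snapshots
    return stored_gb * price_per_gb_month


# Hypothetical inputs: a 2,000 GB volume, 2 GB of change per snapshot,
# 5,000 snapshots retained, and $0.10 per GB per month.
cost = monthly_snapshot_cost(2000, 2, 5000, 0.10)
print(f"${cost:,.2f} per month")
```

With those assumed inputs the snapshots account for 10,000 GB of the 12,000 GB stored, so the bill comes to roughly $1,200 a month, and five-sixths of it is history rather than live data.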

Using the cloud as your primary storage also puts you at the mercy of your cloud storage provider. If they lose your data, raise their rates, go belly up or otherwise cause you problems, retrieving your data and getting set up elsewhere will be a painful process.

Now I'm pretty sure that top-notch providers are better at data management than most organizations, but there is some risk here. The truth is that cloud storage provider SLAs, as important as they may be, can't make you whole after a cloud service provider loses your data, any more than Kodak sending you a new roll of film made you whole after they lost the pictures of your honeymoon in Bora Bora or the kids' first steps.

I'm looking forward to the day when cloud gateways can store their data to multiple cloud back ends to reduce this risk. Even better would be if they could write to a local object store, like a Caringo CAStor or EMC Atmos, and a public cloud provider. That would give me fast access for eDiscovery and real-time offsite backup.
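What that wish might look like in practice is a store that writes every object to all of its back ends and reads from the first one that still has it. This is a sketch of the idea under simple assumptions, not a description of any shipping gateway; `DictBackend` and `MirroredStore` are hypothetical names, and a real version would need to handle partial write failures and retries.

```python
class DictBackend:
    """In-memory stand-in for an object store, local or public cloud."""

    def __init__(self, name):
        self.name = name
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data

    def get(self, key):
        return self.objects[key]  # raises KeyError if the object is gone


class MirroredStore:
    """Writes to every back end; reads from the first that has the object."""

    def __init__(self, backends):
        self.backends = list(backends)  # order matters: list the local store first

    def put(self, key, data):
        for backend in self.backends:
            backend.put(key, data)

    def get(self, key):
        for backend in self.backends:
            try:
                return backend.get(key)
            except KeyError:
                continue  # this back end lost it; try the next one
        raise KeyError(key)
```

Listing the local object store first gives the fast path for searches, while the public cloud copy covers the case where the local copy is lost, and the reverse holds if the provider is the one that loses your data.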

About the Author(s)

Howard Marks

Network Computing Blogger

Howard Marks is founder and chief scientist at DeepStorage LLC, a storage consultancy and independent test lab based in Santa Fe, N.M. and concentrating on storage and data center networking. In more than 25 years of consulting, Marks has designed and implemented storage systems, networks, management systems and Internet strategies at organizations including American Express, J.P. Morgan, Borden Foods, U.S. Tobacco, BBDO Worldwide, Foxwoods Resort Casino and the State University of New York at Purchase. The testing at DeepStorage Labs is informed by that real-world experience.

He has been a frequent contributor to Network Computing and InformationWeek since 1999 and a speaker at industry conferences including Comnet, PC Expo, Interop and Microsoft's TechEd since 1990. He is the author of Networking Windows and co-author of Windows NT Unleashed (Sams).

He is co-host, with Ray Lucchesi, of the monthly Greybeards on Storage podcast, where the voices of experience discuss the latest issues in the storage world with industry leaders. You can find the podcast at: http://www.deepstorage.net/NEW/GBoS
