Dedupe Everywhere Changes Economics

With the new versions of Backup Exec and NetBackup due Monday, Symantec has finally joined CommVault, IBM, Atempo and most of its other backup software competitors in integrating data deduplication into its mainline backup software. Now that you can dedupe data on the media server of your choice, we have to ask the question: "Where is the most cost-effective place to dedupe backup data?"

Howard Marks

February 1, 2010

For the past few years the conventional wisdom was to point your backup software at a deduping disk target like those from Data Domain, Exagrid, Quantum, Sepaton and FalconStor. Compared to replacing your backup software with a source deduping product like Asigra or Avamar, integrating a backup target, especially a VTL, into your existing backup process is relatively painless, and these targets can suck up data at a prodigious rate (up to 5TB/hr for a Data Domain DD880).

Deduping targets are probably still the best solution for enterprise data centers, where users around the world keep servers busy around the clock, minimizing both backup windows and the host CPU cycles available for backup tasks like source deduplication. Smaller shops typically run a single shift, leaving a 12-hour nightly backup window when servers, even virtualized servers, have cycles to spare. The questions then become: How much could a shop with a few terabytes to back up save? And is media server dedupe fast enough?
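To put the "fast enough" question in rough numbers, here is a minimal back-of-the-envelope sketch. The 12-hour window and the 5TB/hr DD880 figure come from the text above; the data sizes and the function name are illustrative assumptions, not measurements from any product.

```python
# Back-of-the-envelope: average throughput needed to finish a full backup
# inside a nightly window. Data sizes are hypothetical examples.

def required_tb_per_hour(data_tb: float, window_hours: float) -> float:
    """Average ingest rate (TB/hr) needed to move data_tb within window_hours."""
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    return data_tb / window_hours

WINDOW_HOURS = 12        # single-shift shop, per the article
DD880_TB_PER_HOUR = 5.0  # quoted peak rate for a Data Domain DD880

for data_tb in (1, 2, 4):  # hypothetical "few terabytes" shops
    rate = required_tb_per_hour(data_tb, WINDOW_HOURS)
    mb_per_sec = rate * 1_000_000 / 3600  # decimal units: 1 TB = 1,000,000 MB
    share = rate / DD880_TB_PER_HOUR
    print(f"{data_tb} TB in {WINDOW_HOURS} h: {rate:.2f} TB/hr "
          f"(~{mb_per_sec:.0f} MB/s, {share:.0%} of a DD880's quoted rate)")
```

Under these assumptions, even a 4TB full backup needs only about a third of a terabyte per hour, or a bit under 100 MB/s sustained. Whether software dedupe on a media server can hold that rate is exactly what lab testing would have to show.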

An entry-level deduping disk target like a Quantum DXi 3500 or Exagrid EX4000, both of which have 4TB of usable disk space, costs $40,000-50,000, which could be a significant chunk of an SME's total hardware budget for the year. Integrating the appliance into the network and updating the 20-100 backup jobs a company this size would have should take less than a day.

If that company were a Backup Exec user, as many are, it could add 4TB of disk to its backup server and add the $1,995 data deduplication option instead. A Dell MD1000 SAS JBOD with a PERC 6/I RAID controller and fifteen 500GB drives is about $7,200. Even if they decided to have their favorite consultant or VAR build a new Xeon 5500 server with beaucoup memory (Symantec's dedupe needs about 1GB of memory for each TB of disk, and other Backup Exec features like granular recovery of Exchange mailboxes are memory intensive themselves), the total cost would be on the order of half that of a deduping target. Other vendors, including CommVault, charge for dedupe on a capacity basis, but not enough to change the basic economics.

The other question is how fast backup apps actually dedupe data. The proof of the pudding is in the eating, and we're looking forward to getting deduping backup software from Symantec and others in the lab so we can generate some real-world metrics and best practice recommendations.
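Going back to the cost comparison, the figures quoted above can be tallied in a short sketch. The appliance, dedupe option and disk shelf prices are the ones from the text. The server budgets are hypothetical, since no server price is quoted here, and the names are made up for illustration.

```python
# Rough cost tally in 2010 dollars, using the prices quoted in the article.
# SERVER_BUDGETS is a hypothetical range; the article gives no server price.

APPLIANCE_LOW, APPLIANCE_HIGH = 40_000, 50_000  # entry-level deduping target
DEDUPE_OPTION = 1_995                           # Backup Exec dedupe option
DISK_SHELF = 7_200                              # MD1000, RAID controller, 15 x 500GB
RAM_GB_PER_TB = 1.0                             # "about 1GB of memory for each TB"

def software_route_cost(server_cost: float = 0.0) -> float:
    """Dedupe option plus disk shelf, plus an optional new server."""
    return DEDUPE_OPTION + DISK_SHELF + server_cost

def dedupe_ram_gb(disk_tb: float) -> float:
    """Approximate memory the dedupe option wants for disk_tb of disk."""
    return disk_tb * RAM_GB_PER_TB

SERVER_BUDGETS = (0, 5_000, 10_000, 15_000)     # hypothetical; 0 = reuse existing server

for server in SERVER_BUDGETS:
    total = software_route_cost(server)
    print(f"server ${server:>6,}: total ${total:>8,.0f} "
          f"({total / APPLIANCE_HIGH:.0%} to {total / APPLIANCE_LOW:.0%} of an appliance)")

print(f"RAM for dedupe on 4TB of disk: about {dedupe_ram_gb(4):.0f}GB")
```

With an existing server the software route comes to $9,195. Under these assumptions, a new server costing somewhere between $10,000 and $15,000 keeps the total in the neighborhood of half the appliance price, which matches the estimate above.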

Note: We at DeepStorage.net do work for Symantec occasionally.

About the Author

Howard Marks

Network Computing Blogger

Howard Marks is founder and chief scientist at Deepstorage LLC, a storage consultancy and independent test lab based in Santa Fe, N.M. and concentrating on storage and data center networking. In more than 25 years of consulting, Marks has designed and implemented storage systems, networks, management systems and Internet strategies at organizations including American Express, J.P. Morgan, Borden Foods, U.S. Tobacco, BBDO Worldwide, Foxwoods Resort Casino and the State University of New York at Purchase. The testing at DeepStorage Labs is informed by that real world experience.

He has been a frequent contributor to Network Computing and InformationWeek since 1999 and a speaker at industry conferences including Comnet, PC Expo, Interop and Microsoft's TechEd since 1990. He is the author of Networking Windows and co-author of Windows NT Unleashed (Sams).

He is co-host, with Ray Lucchesi, of the monthly Greybeards on Storage podcast, where the voices of experience discuss the latest issues in the storage world with industry leaders. You can find the podcast at: http://www.deepstorage.net/NEW/GBoS
