Scaling Backup Deduplication With Clustered Storage

In my last entry I looked at scaling single-system solutions, and in this entry I'll take a look at scaling backup deduplication via a storage cluster approach delivered by companies like Sepaton and Exagrid. The idea here is to make adding capacity and performance as simple as adding another node to the cluster. Each time you add a node, capacity and performance scale with it. This may be ideal for the enterprise and even for rapid-growth mid-tier companies.

George Crump

October 30, 2009

Not all clustered storage systems are created equal. As we discussed in our entry "Storage Clusters - Tightly Coupled vs. Loosely Coupled," the key thing to understand is how these storage clusters deliver on their main promise: to still deduplicate backups efficiently. While referencing a single target that scales seamlessly in the background is an improvement, you also want to make sure deduplication is applied globally across the cluster. In some cases, deduplication is done only on a per-node basis, so identical data that lands on different nodes is stored more than once and overall deduplication effectiveness drops.
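To make the per-node versus global difference concrete, here is a minimal sketch in Python. It is a toy model, not how any vendor's product works: the round-robin placement of backup streams, the SHA-256 chunk fingerprints, and the function names are all assumptions made for illustration.

```python
import hashlib


def fingerprint(chunk: bytes) -> str:
    """Identify a chunk by its content hash (assumed scheme for this sketch)."""
    return hashlib.sha256(chunk).hexdigest()


def stored_chunks_per_node(streams, node_count):
    """Per-node dedup: every node keeps its own index of chunks it has seen.

    Streams are placed round-robin (an assumption); a chunk already stored
    on one node is stored again if it shows up on another node.
    """
    indexes = [set() for _ in range(node_count)]
    for i, stream in enumerate(streams):
        index = indexes[i % node_count]
        for chunk in stream:
            index.add(fingerprint(chunk))
    return sum(len(index) for index in indexes)


def stored_chunks_global(streams):
    """Global dedup: one index shared by the whole cluster."""
    index = set()
    for stream in streams:
        for chunk in stream:
            index.add(fingerprint(chunk))
    return len(index)


# Hypothetical workload: four hosts share 100 common chunks (think OS files)
# and each adds 10 chunks of its own.
common = [f"os-block-{n}".encode() for n in range(100)]
streams = [
    common + [f"host{h}-data-{n}".encode() for n in range(10)]
    for h in range(4)
]

logical = sum(len(s) for s in streams)
print("logical chunks written:", logical)
print("stored, per-node dedup on 2 nodes:", stored_chunks_per_node(streams, 2))
print("stored, global dedup:", stored_chunks_global(streams))
```

In this toy workload the two-node cluster with per-node indexes stores the shared chunks once per node, while the global index stores them once for the whole cluster. Real systems differ in how they place data and how they look up fingerprints across nodes, so treat the numbers only as an illustration of the effect.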

Second, some systems require that you point backups at a specific node in the cluster as opposed to a virtual node or control node. Neither limitation is a deal breaker, but both are worth being aware of. My thinking is that if you want a clustered storage system that will grow with you, especially in the enterprise, then you also want deduplication and performance to improve globally as you add nodes.

Finally, as anyone who has managed a cluster of any type knows, a cluster implies added complexity, and a storage cluster is no different. Storage vendors have reduced the complexity somewhat by pre-packaging the base configurations of the cluster. If you have the time to evaluate solutions, make sure you test adding a node to the cluster. Do it yourself, from opening the box all the way through adding the node to the cluster and rebalancing storage capacity. If you don't have time to evaluate solutions, then you should ask hard questions to make sure you understand exactly how nodes are added and what you have to do to make that happen.

As is the case with primary storage, there is no one right answer for all data centers, and as a result there is a never-ending supply of options. Single-unit deduplication systems seem to benefit from initial simplicity and potentially better energy efficiency, and they should have a cost advantage. Multi-node clusters benefit from fewer forklift upgrades and potentially from global deduplication.

About the Author

George Crump

Founder
