Scaling Backup Deduplication With Clustered Storage
In my last entry I looked at scaling single-system solutions, and in this entry I'll take a look at scaling backup deduplication via a storage cluster approach delivered by companies like Sepaton and Exagrid. The idea here is to make adding capacity and performance as simple as adding another node to the cluster. Each time you add a node, capacity and performance scale with it. This may be ideal for the enterprise and even for rapid-growth mid-tier companies.
October 30, 2009
Not all clustered storage systems are created equal. As we discussed in our entry "Storage Clusters - Tightly Coupled vs. Loosely Coupled," the key thing to understand is how these storage clusters deliver on their main promise: to still deduplicate backups in an efficient manner. While referencing a single target that scales seamlessly in the background is an improvement, you may also want to make sure the deduplication is applied globally across the cluster. In some cases, deduplication is only done on a per-node basis, which means identical data landing on different nodes is stored more than once and deduplication effectiveness drops somewhat.
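To make the per-node versus global distinction concrete, here is a minimal Python sketch. It is a toy model, not how any particular vendor implements deduplication: the fixed-size "chunks," the round-robin pinning of backup jobs to nodes, and every function name are assumptions made purely for illustration.

```python
import hashlib


def fingerprint(chunk: bytes) -> str:
    """Identify a chunk by its SHA-256 digest."""
    return hashlib.sha256(chunk).hexdigest()


def stored_chunks_per_node(backups, node_count):
    """Per-node dedup: each job is pinned to one node (round-robin, an
    assumption), and a node only recognizes chunks it has already seen."""
    node_indexes = [set() for _ in range(node_count)]
    for job_number, chunks in enumerate(backups):
        index = node_indexes[job_number % node_count]
        for chunk in chunks:
            index.add(fingerprint(chunk))
    return sum(len(index) for index in node_indexes)


def stored_chunks_global(backups):
    """Global dedup: one fingerprint index shared by the whole cluster."""
    cluster_index = set()
    for chunks in backups:
        for chunk in chunks:
            cluster_index.add(fingerprint(chunk))
    return len(cluster_index)


# Hypothetical data: four backup jobs of 100 chunks each, mostly identical.
base = [f"block-{i}".encode() for i in range(100)]
backups = [
    base,
    base[:90] + [f"changed-a-{i}".encode() for i in range(10)],
    base,
    base[:95] + [f"changed-b-{i}".encode() for i in range(5)],
]

per_node = stored_chunks_per_node(backups, node_count=2)
global_total = stored_chunks_global(backups)
print(f"unique chunks stored, per-node dedup: {per_node}")
print(f"unique chunks stored, global dedup:   {global_total}")
```

In this toy run the two-node, per-node layout ends up keeping 210 unique chunks while the global index keeps 115, because the shared base data is stored once on each node. Real systems differ in chunking, routing and indexing, so treat the gap as directional only.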
Second, some systems require that you point to a specific node in the cluster as opposed to a virtual node or control node. Neither issue is a deal breaker, but both are worth being aware of. My thinking is that if you want a clustered storage system that will grow with you, especially in the enterprise, then you also want deduplication and performance to improve globally as you add nodes.
Finally, as anyone who has managed a cluster of any type knows, a cluster implies added complexity. A storage cluster is no different. Storage vendors have reduced the complexity somewhat by pre-packaging the base configurations of the cluster. If you have the time to evaluate solutions, make sure you test adding a node to the cluster. Do it yourself, from the point of opening the box all the way through adding the node to the cluster and rebalancing storage capacity. If you don't have time to evaluate solutions, then you should ask hard questions to make sure you understand exactly how nodes are added and what you have to do to make that happen.
As is the case with primary storage, there is no one right answer for all data centers. As a result, there is a never-ending supply of options. Single-unit deduplication systems seem to benefit from initial simplicity and potentially better energy efficiency, and should have a cost advantage. Multi-node clusters benefit from fewer forklift upgrades and potentially global deduplication.