Do We Need Consolidated Deduplication?


George Crump

July 31, 2009


In my last entry, "Can We Get to a Single Point of Deduplication?", I looked at who had the capabilities for a single point of deduplication: consolidating the deduplication engine so that one supplier and one dedupe engine manage all the optimized data regardless of storage tier, whether primary, secondary, archive or backup. Another question is, do you really need it?


As I pointed out, there would be some theoretical gains from consolidating the deduplication process. The more data that goes through the engine, the better the chance that it has seen a given piece of data before and can optimize it. Also, if you have a different deduplication engine at each tier, the data has to be expanded, or rehydrated, as you move it between the tiers of storage.
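To make the first of those gains concrete, here is a minimal sketch of a toy deduplication engine. It assumes fixed-size chunks and SHA-256 fingerprints, which is a simplification; real products typically use their own chunking and indexing schemes. The class name `DedupeEngine`, the tiny chunk size and the sample data are all hypothetical, chosen only for illustration.

```python
import hashlib


class DedupeEngine:
    """Toy fixed-size-chunk deduplication store (illustrative only)."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}  # fingerprint -> chunk bytes, each stored once

    def write(self, data):
        """Store data and return its recipe (ordered list of fingerprints)."""
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(fingerprint, chunk)  # keep first copy only
            recipe.append(fingerprint)
        return recipe

    def rehydrate(self, recipe):
        """Expand a recipe back into the original bytes."""
        return b"".join(self.chunks[fp] for fp in recipe)

    def stored_bytes(self):
        return sum(len(chunk) for chunk in self.chunks.values())


# Hypothetical data: the two tiers share the chunks "AAAA" and "BBBB".
primary_data = b"AAAABBBBCCCC"
backup_data = b"AAAABBBBDDDD"

# One engine per tier: the shared chunks are stored in both silos.
primary_engine, backup_engine = DedupeEngine(), DedupeEngine()
primary_engine.write(primary_data)
backup_engine.write(backup_data)
siloed = primary_engine.stored_bytes() + backup_engine.stored_bytes()

# One consolidated engine: the shared chunks are stored once.
consolidated_engine = DedupeEngine()
primary_recipe = consolidated_engine.write(primary_data)
consolidated_engine.write(backup_data)
consolidated = consolidated_engine.stored_bytes()

print(siloed, consolidated)
```

In this toy case the two silos hold 24 bytes between them, while the consolidated engine holds 16, because it has already seen the chunks the tiers have in common.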


The reality is that we will not have a consolidated deduplication platform for a while, so hopefully you don't feel that you need it right away. A recent survey we did found that fewer than 20% of data centers had implemented any form of deduplication. If you have deduplication at all, you are likely to be using it as part of the backup process. To some extent, then, you already have consolidated deduplication, because you have implemented it on only one tier.


To have a multi-silo deduplication problem, you have to be using the technology on more than one tier. Users seem to be equally split on which silo to optimize next: primary, or secondary/archive? I think primary will be the next major target, more so than archive. Most archives already have deduplication, so it is not really a decision point, and archives are judged as much on their retention and scalability capabilities as on their ability to deduplicate data.


I also think that as disk archive solutions become faster, or backup deduplication appliances become more archive capable, the two tiers may merge into one from a deduplication standpoint. As a result, they may be the first area where deduplication engines consolidate.

 

Primary storage will be the next big battleground for optimization for one simple reason: the payoff here is the greatest. If you deliver a technology that can store more data in less space on the most expensive tier of storage, people are going to pay attention to you. As we indicate in our Primary Storage Optimization screencast, there are many different ways to get there: compression, deduplication and archiving, to name a few. Compression may be more important than deduplication in primary storage. You can compress everything, but you can only deduplicate redundant data.
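A small sketch can illustrate the difference between the two techniques. It assumes fixed 4 KB chunks with SHA-256 fingerprints for deduplication and uses Python's standard `zlib` module as a stand-in for whatever compression a primary storage product might use. The helper `deduped_size` and the sample data are hypothetical.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # assumed chunk size, for illustration only


def deduped_size(data, chunk_size=CHUNK_SIZE):
    """Bytes left after keeping one copy of each distinct fixed-size chunk."""
    unique_chunks = {}
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        unique_chunks.setdefault(hashlib.sha256(chunk).hexdigest(), chunk)
    return sum(len(chunk) for chunk in unique_chunks.values())


# Data with no repeated chunks: every record carries a different number.
unique_data = b"".join(
    f"record {i:06d} status ok\n".encode() for i in range(1000)
)

# Data made of the same 4 KB block stored six times.
redundant_data = unique_data[:CHUNK_SIZE] * 6

for name, data in (("unique", unique_data), ("redundant", redundant_data)):
    print(name, len(data), deduped_size(data), len(zlib.compress(data)))
```

On the data with no repeated chunks, deduplication saves nothing while compression still shrinks it. On the redundant data, deduplication collapses six copies of the block into one. The sample data is text that compresses well; data that is already compressed would behave differently.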


At that point we will have two or three tiers of storage being deduplicated by potentially three different technologies. Movement between the tiers is going to be a challenge because, as I stated earlier, it may require rehydrating the data if the technologies differ. For vendors, it seems the next tier over is relatively easy to incorporate: primary storage deduplication companies can develop a movement engine to move data to the next tier, and backup deduplication companies can move up to the archive tier.
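The rehydration cost can be sketched in a few lines. This example assumes two hypothetical tiers whose engines differ only in chunk size (4 bytes versus 8 bytes), which is enough to make one tier's fingerprints meaningless to the other. The functions `ingest` and `rehydrate` and the store names are illustrative, not any vendor's interface.

```python
import hashlib


def ingest(store, data, chunk_size):
    """Deduplicate data into store; return its recipe of fingerprints."""
    recipe = []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        store.setdefault(fingerprint, chunk)
        recipe.append(fingerprint)
    return recipe


def rehydrate(store, recipe):
    """Expand a recipe back into the full, unoptimized bytes."""
    return b"".join(store[fp] for fp in recipe)


primary_store, archive_store = {}, {}
data = b"AAAABBBBAAAABBBB"

# The primary tier deduplicates with 4-byte chunks.
primary_recipe = ingest(primary_store, data, chunk_size=4)

# The archive tier cannot use the primary tier's recipe as-is: it has
# never seen those fingerprints, and it chunks data differently.
usable_directly = all(fp in archive_store for fp in primary_recipe)

# So the move means expanding to full size, then re-deduplicating.
full_copy = rehydrate(primary_store, primary_recipe)
archive_recipe = ingest(archive_store, full_copy, chunk_size=8)

print(usable_directly, len(full_copy), len(archive_recipe))
```

The data crosses between the tiers at its full, unoptimized size, and the second engine repeats the fingerprinting work that the first engine already did.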


Completing the loop, so to speak (primary, archive and backup), is yet to be done. As deduplication becomes a more broadly used technology, it ideally will be available from a single platform as opposed to a silo for each tier. In reality, I would not hold my breath. We have been promised a lot of consolidation of different storage technologies throughout the years, and not much has really been consolidated.


The bottom line is yes, we need a single point of deduplication, but it is unlikely that we will get there anytime soon. Be prepared to manage separate silos of deduplication.
