Dedupe's Next Era - Part Two

The most significant component of this next era will be the move out of backup-focused deduplication and up the storage stack to the secondary and primary storage tiers.

George Crump

June 22, 2009

In my last entry we began to frame the next era of deduplication. The most significant component of this next era will be the move out of backup-focused deduplication and up the storage stack to the secondary and primary storage tiers. Significant work is clearly already under way in these tiers, but in this second era deduplication on secondary and primary storage will become a requirement. Every primary storage vendor will need a solution in this space.

These higher tiers are also where deduplication gets interesting, because the data set is less ideal for it. There is simply less duplicate data, and any performance impact will be more noticeable. The algorithms will need to be smarter: either more content-aware or more granular, and of course faster or less resource-intensive.
Moving up the storage stack will rekindle the inline vs. post-process debate because of those performance concerns. Can you make the dedupe engine fast enough to dedupe inline on primary storage, or does it make more sense to dedupe post-process? There is an alternative method, not yet commonly used: a parallel dedupe that delivers near-inline deduplication speeds. If the dedupe process begins to affect performance under heavy write conditions, it can shift out of the way and become a post-process dedupe until it catches up.
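
To make that hybrid idea concrete, here is a minimal sketch in Python. It is not any vendor's implementation; the class name AdaptiveDedupeStore, the heavy_load flag and the catch_up method are all hypothetical, and a real system would measure write pressure itself rather than be told about it. Under light load each block is fingerprinted and deduplicated in the write path; under heavy load the block lands untouched and is deduplicated later.

```python
import hashlib
from collections import deque


def fingerprint(block: bytes) -> str:
    """Content fingerprint used to recognize identical blocks."""
    return hashlib.sha256(block).hexdigest()


class AdaptiveDedupeStore:
    """Toy block store (hypothetical): inline dedupe normally, deferred under load."""

    def __init__(self):
        self.unique = {}        # fingerprint -> block bytes, each stored once
        self.layout = []        # per logical write: ("ref", fingerprint) or ("raw", bytes)
        self.pending = deque()  # positions in layout that still hold raw data

    def _store(self, block: bytes) -> str:
        fp = fingerprint(block)
        self.unique.setdefault(fp, block)  # keep only the first copy
        return fp

    def write(self, block: bytes, heavy_load: bool = False) -> None:
        if heavy_load:
            # Shift out of the write path: land the block as-is, dedupe later.
            self.pending.append(len(self.layout))
            self.layout.append(("raw", block))
        else:
            # Inline path: fingerprint and dedupe before acknowledging the write.
            self.layout.append(("ref", self._store(block)))

    def catch_up(self) -> int:
        """Post-process pass over deferred blocks; returns how many were handled."""
        handled = 0
        while self.pending:
            pos = self.pending.popleft()
            _, block = self.layout[pos]
            self.layout[pos] = ("ref", self._store(block))
            handled += 1
        return handled

    def read(self, pos: int) -> bytes:
        kind, value = self.layout[pos]
        return value if kind == "raw" else self.unique[value]

    def physical_bytes(self) -> int:
        """Bytes actually consumed: unique blocks plus not-yet-deduped raw blocks."""
        raw = sum(len(v) for kind, v in self.layout if kind == "raw")
        return raw + sum(len(b) for b in self.unique.values())
```

Writes made with heavy_load=True cost full capacity until catch_up() runs, which is the trade the paragraph above describes: write performance is protected first and the space savings arrive later.
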
The next era of dedupe, when used on primary storage, will also need to be able to move that data. Archiving can be a huge cost-control mechanism for IT administrators, but finding the time to implement those processes has been a challenge; building it into primary storage dedupe makes a lot of sense. Extending the capability to offer an optimized migration to the cloud, as part of a comprehensive migration and archive strategy, can be very appealing, as we discuss in our article Deduplicating Cloud Storage.

Finally, compression has to enter this conversation at some point. Our findings have repeatedly shown that compression, especially on primary storage and possibly on archive storage, can deliver efficiencies as good as, if not greater than, deduplication alone. To be effective, deduplication requires redundant data; compression shrinks just about everything, to varying degrees.
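
A small Python sketch illustrates the difference. The helper names, the 4 KB fixed chunk size and the sample data are assumptions chosen for illustration, not measurements from our testing. The sample resembles primary storage data in that every record is slightly different, so no two chunks are identical, yet the content is still highly repetitive at the byte level.

```python
import hashlib
import zlib


def dedupe_size(data: bytes, chunk: int = 4096) -> int:
    """Bytes left after fixed-size chunk dedupe: each distinct chunk counted once."""
    seen = {}
    for i in range(0, len(data), chunk):
        piece = data[i:i + chunk]
        seen.setdefault(hashlib.sha256(piece).digest(), len(piece))
    return sum(seen.values())


def compressed_size(data: bytes) -> int:
    """Bytes left after general-purpose compression (zlib, default level)."""
    return len(zlib.compress(data))


# Hypothetical sample: similar records, each with a unique counter.
records = b"".join(b"record %06d status=ok\n" % i for i in range(20000))

print("original:  ", len(records))
print("deduped:   ", dedupe_size(records))
print("compressed:", compressed_size(records))
```

Because every chunk contains different counter values, chunk-level dedupe finds nothing to remove here, while compression still shrinks the data considerably. Data containing many identical blocks would tilt the result back toward deduplication.
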
The next era of dedupe has begun. The suppliers are already jockeying for position, and it starts the moment the ink is dry on the Data Domain acquisition.
