You Can Eliminate Backups

George Crump

October 4, 2010

Network Computing

One question making the rounds in backup circles lately is whether the entire backup process can be eliminated now that most storage systems offer some combination of snapshots, deduplication, compression and replication. The idea sounds good: let primary storage take care of itself and eliminate one of the more troublesome processes in the data center. But there are some holes in the strategy. The question is, can primary storage accomplish everything we expect from a backup target?

For instance, what we want from a backup system is the ability to roll back to a specific point in time in case a system fails or a data set becomes corrupted. This means being able to copy versions of a good data set at certain times, often nightly, and storing those copies on a separate storage device where capacity is less expensive than on primary storage.

Using a combination of snapshots, deduplication, compression and replication is a cost-effective way of storing redundant copies. Many primary storage systems support a high number of snapshots and/or unlimited copies of data by leveraging deduplication. Most can then replicate that data to a remote site so you are covered for a single-site disaster. With these features deployed, we have both point-in-time local recovery and total system recovery after a disaster covered, but there are some potential drawbacks.

The obvious hole in relying on primary storage is the risk of a system failure at the primary location. This can come from a controller bug or from a multiple-drive failure that your current RAID level can't recover from. If the primary storage system is down, and it is your sole source for backup copies, then you're going to need to recover from the remote copy. The key concern now is time. How long will it take to get all the data back over the wide area connection and recovered on the repaired local system? It might be easier to ship the repaired system to the DR unit and reload it there than to drag all the data across the wire.
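To put a rough number on the recovery-time concern, here is a minimal back-of-the-envelope sketch. The function name, the 20 TB data set, the link speeds and the 70% link efficiency are all illustrative assumptions, not figures from any particular product or site.

```python
def restore_hours(data_tb: float, link_mbps: float, efficiency: float = 0.7) -> float:
    """Estimate the hours needed to pull data_tb terabytes over a WAN link.

    link_mbps is the nominal link speed in megabits per second.
    efficiency is an assumed fraction of that speed actually achieved.
    """
    if data_tb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("data_tb must be >= 0, link_mbps > 0, 0 < efficiency <= 1")
    bits = data_tb * 1e12 * 8                    # decimal terabytes -> bits
    usable_bps = link_mbps * 1e6 * efficiency    # usable bits per second
    return bits / usable_bps / 3600              # seconds -> hours


if __name__ == "__main__":
    # Hypothetical 20 TB data set over three assumed link speeds.
    for link in (100, 1000, 10000):
        print(f"20 TB over {link} Mbps: {restore_hours(20, link):.1f} hours")
```

Under these assumptions, 20 TB over a 100 Mbps link works out to roughly 26 days, which is why physically moving hardware can be the faster option.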

The other problem is the way all this redundant data is referenced. With deduplication, when a copy of data is made or a snapshot is taken, the snapshot and/or deduplication services build a reference table that maps write requests for redundant copies of data back to the original instead of actually writing them. This reference table is a database, and like any database it can become corrupted or fail. Your primary data and all the point-in-time copies of that data depend on this table to reassemble themselves. If the table gets corrupted, then your point-in-time copies and even your primary data may not be readable.

The chances of either of the above scenarios happening are relatively small, but that's why we do backups: to cover the odd event that causes us to lose everything. You could replicate to a second unit locally and then to a third in DR, but that is essentially the same thing as doing a backup. We are also assuming that the system failure or corruption would be instantly apparent.

That assumption may not hold. The deduplication or snapshot engine could produce a silent error that does not appear right away. Either data is being written in a corrupted form or the deduplication tables are finding false positives, yet everything appears to be working correctly, and you may not know you have a problem until months later. Suddenly you go to read a file, and it is either missing or corrupted. Most deduplication processes have self-check code to help prevent this sort of thing from occurring, but it is something to be aware of.
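The dependence on a shared reference table, and the kind of self-check that catches silent damage, can be illustrated with a toy model. This is only a sketch under stated assumptions: the DedupStore class, its 4-byte chunks and its scrub method are hypothetical and do not represent how any real storage system is implemented. False positives from fingerprint collisions are not modeled.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: chunks are written once, names hold references."""

    def __init__(self, chunk_size: int = 4):
        self.chunk_size = chunk_size
        self.chunks = {}   # fingerprint -> chunk bytes (each unique chunk stored once)
        self.table = {}    # file or snapshot name -> ordered list of fingerprints

    def write(self, name: str, data: bytes) -> None:
        refs = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            fp = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(fp, chunk)   # duplicate chunks are not rewritten
            refs.append(fp)
        self.table[name] = refs

    def snapshot(self, name: str, snap_name: str) -> None:
        # A snapshot copies only the references, never the data itself.
        self.table[snap_name] = list(self.table[name])

    def read(self, name: str) -> bytes:
        return b"".join(self.chunks[fp] for fp in self.table[name])

    def scrub(self) -> list:
        """Return names whose chunks are missing or no longer match their fingerprint."""
        bad = []
        for name, refs in self.table.items():
            for fp in refs:
                chunk = self.chunks.get(fp)
                if chunk is None or hashlib.sha256(chunk).hexdigest() != fp:
                    bad.append(name)
                    break
        return bad


if __name__ == "__main__":
    store = DedupStore()
    store.write("db", b"AAAABBBBAAAA")
    store.snapshot("db", "db@monday")
    print(len(store.chunks))          # unique chunks actually stored

    # Simulate silent corruption of one shared chunk.
    first_fp = store.table["db"][0]
    store.chunks[first_fp] = b"XXXX"
    print(store.read("db@monday"))    # the read still succeeds, but returns bad data
    print(store.scrub())              # the self-check flags every name sharing the chunk
```

Because the primary copy and its snapshot share the same stored chunks, a single damaged chunk affects both. Ordinary reads do not notice; only the explicit scrub does.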

Counting on snapshots, deduplication and replication as your primary and even secondary recovery options is perfectly acceptable. It works 99.999% of the time and provides you with rapid recovery of critical information. Just be aware there are risks involved in not storing your data on a separate platform.
