Why Not Mirror In The Volume Manager?
December 7, 2010
As I talk to users and vendors about their data protection schemes, I've noticed that many organizations protect their mission-critical applications by synchronously mirroring data from a primary array to a secondary array. They then replicate from the secondary array off-site, or off-campus, asynchronously to a third target. As I think about this architecture, I've started wondering if companies might not be better off mirroring to the primary and secondary arrays directly from their servers, rather than replicating synchronously.
While host volume managers may have been originally designed to support one or more DAS (direct-attached storage)-connected JBODs (just a bunch of disks), the process of using a volume manager with external RAID arrays is nothing new. Back in the dark ages, I had to connect a couple of Windows NT 3.5 servers to an early EMC Symmetrix system via HVD (high-voltage differential) SCSI.
The first thing I learned on this project was the little lesson that plugging the HVD cable into the server's single-ended SCSI port was a bad idea. (It released the magic blue smoke that makes all computers work when contained in those little chip packages.) I then discovered that the Symmetrix presented its RAID-S (an old EMC thing you don't want to know more about) volumes as a 9GB drive, so to make a 45GB NTFS volume I had to aggregate the 9GB drives using Windows volume manager.
Back in the present, I can see some significant advantages to host mirroring over synchronous replication. The most significant is how the system behaves in the event of an array failure. In the replicating example, servers will lose their connections to their data volumes and the administrator will have to manually reconnect them to the "crash consistent" copy on the secondary array. The volume manager, on the other hand, should see an array failure as the failure of one of a mirrored pair of drives and continue limping along.
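To make that failure behavior concrete, here is a minimal sketch of how a host-side mirror might treat an array failure. It is a toy model, not any real volume manager's code: `ArrayLeg` and `MirroredVolume` are hypothetical names, and each "array" is just an in-memory dictionary of blocks.

```python
class ArrayLeg:
    """Hypothetical stand-in for a LUN presented by one array."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.blocks = {}

    def write(self, lba, data):
        if not self.online:
            raise OSError(f"{self.name} is offline")
        self.blocks[lba] = data

    def read(self, lba):
        if not self.online:
            raise OSError(f"{self.name} is offline")
        return self.blocks[lba]


class MirroredVolume:
    """Toy host mirror: every write goes to all legs that are still online."""

    def __init__(self, legs):
        self.legs = list(legs)

    def write(self, lba, data):
        succeeded = 0
        for leg in self.legs:
            try:
                leg.write(lba, data)
                succeeded += 1
            except OSError:
                # One leg is gone; keep running degraded on the survivor.
                pass
        if succeeded == 0:
            raise OSError("all mirror legs failed")
        return succeeded  # number of legs that accepted the write

    def read(self, lba):
        for leg in self.legs:
            try:
                return leg.read(lba)
            except OSError:
                continue  # try the next leg
        raise OSError("all mirror legs failed")


primary = ArrayLeg("primary")
secondary = ArrayLeg("secondary")
volume = MirroredVolume([primary, secondary])

volume.write(0, b"before failure")
primary.online = False              # simulate losing the primary array
volume.write(1, b"after failure")   # still succeeds, on one leg only
```

The server never has to be repointed at a different copy. A real volume manager would also track which blocks changed while the leg was down so it can resynchronize later, which this sketch leaves out.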
The other big advantage is, of course, cost. Most array vendors charge a significant amount for their replication software, whereas most host operating systems include volume managers free of charge.
At the low end, some server-based array products support synchronous replication to make up for their single points of failure at the motherboard and/or disk controller. Unfortunately, I've seen that actually using this feature can have a significant negative impact on disk I/O performance. It may just be asking too much from a server motherboard to manage a bunch of storage and keep a doppelganger in sync at all times. Mirroring at the host could avoid the single point of failure with a smaller performance impact.
Clearly, host mirroring isn't a universal replacement for synchronous replication. I wouldn't want to mirror a host to arrays in different data centers if there were any significant latency from the server to the more distant array. Even 100 microseconds of additional latency would make a mirrored pair of arrays feel more like a SATA drive and a SAS drive mirrored together, with performance matching the slower of the pair.
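The "slower of the pair" point comes down to simple arithmetic: a mirrored write is not complete until every leg has acknowledged it, so its latency is the maximum of the leg latencies. Here is a rough sketch, assuming both legs are written in parallel. The function name and the 500-microsecond baseline are illustrative assumptions, not measurements.

```python
def mirrored_write_latency_us(leg_latencies_us):
    """Latency of one mirrored write, assuming the host issues the write
    to all legs in parallel and waits for every acknowledgement."""
    return max(leg_latencies_us)


local_array_us = 500                    # assumed latency to the nearby array
remote_array_us = local_array_us + 100  # the extra 100 microseconds from above

both_local = mirrored_write_latency_us([local_array_us, local_array_us])
stretched = mirrored_write_latency_us([local_array_us, remote_array_us])
print(both_local, stretched)
```

Under these assumptions every write pays the remote array's latency, no matter how fast the local array is.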
Other than "it's just wrong," why don't more users mirror from the host? Comments, and reasons, welcome.