Cache or Tier? Flash By Any Other Name...
March 30, 2010
Today's flash SSDs are more than the latest in a long series of storage systems that provide performance at any cost. RAMdisks, head-per-track disks, short-stroked drives and the like all boosted performance for data sets small enough to fit in the limited space they provided. Vendors are promising that we can use the performance of SSDs for more mainstream applications by moving the "hottest" data onto SSDs and leaving the rest behind on capacity-oriented drives. Now we just have to agree on what to call it. Caching, tiering, potato, potahto...
Most of the tiering buzz so far has been about tiering inside the array. Compellent and 3Par are delivering sub-LUN tiering now, and while the current version of EMC's FAST can only relocate whole LUNs, they've been promising sub-LUN tiering for delivery later this year. It's no surprise that the upstart vendors with wide-striping architectures, which spread data blocks across all disks, were the first to do automated tiering. They had a head start: their architectures already built LUNs from almost randomly assigned blocks, so they just had to make the system spread blocks across storage with different performance characteristics. Engineers starting with architectures where a LUN is a series of contiguous blocks in a RAID set had more work to do.
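To make the architectural difference concrete, here's a minimal sketch of the block-map idea, with every class and field name my own invention rather than any vendor's actual design. When a LUN is just a mapping from logical blocks to wherever those blocks happen to live, moving one block to a faster tier is a copy plus a metadata update; nothing about neighboring blocks has to change.

```python
# Hypothetical sketch: a LUN as a block map, the kind of layout that makes
# sub-LUN tiering a metadata update rather than a shuffle of contiguous extents.
from dataclasses import dataclass


@dataclass
class Extent:
    tier: str          # e.g. "ssd" or "sata"
    physical_lba: int  # where the block lives on that tier


class BlockMappedLUN:
    def __init__(self):
        # logical block address -> where that block actually lives
        self.block_map = {}

    def promote(self, lba, new_physical_lba):
        """Move one logical block to the SSD tier: copy the data (not shown),
        then repoint the map entry. The rest of the LUN is untouched."""
        self.block_map[lba] = Extent(tier="ssd", physical_lba=new_physical_lba)
```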
The first step toward sub-LUN tiering is to start collecting access-frequency metadata on disk blocks. That lets a policy engine periodically identify the blocks being accessed most frequently and move them to a faster tier of storage, while moving cooler blocks from the fast SSD tier down to a spinning-disk tier. The simplest policy would be to migrate the blocks with the highest access rates to faster tiers on a nightly basis.
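A minimal sketch of that simplest nightly policy might look like the following; the function and the `mover` interface are assumptions for illustration, not any vendor's actual policy engine.

```python
# Hypothetical sketch of the simplest policy: once a night, rank blocks by
# access count and keep only the hottest ones on the SSD tier.
def nightly_rebalance(access_counts, ssd_capacity_blocks, current_tier, mover):
    """access_counts:  {block_id: reads + writes since the last run}
    current_tier:      {block_id: "ssd" or "sata"}
    mover:             assumed object with promote(block_id) / demote(block_id)
    """
    ranked = sorted(access_counts, key=access_counts.get, reverse=True)
    hot = set(ranked[:ssd_capacity_blocks])

    for block_id, tier in current_tier.items():
        if block_id in hot and tier != "ssd":
            mover.promote(block_id)   # copy up to flash
        elif block_id not in hot and tier == "ssd":
            mover.demote(block_id)    # copy back down to spinning disk
```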
The problem with this simple policy is that, as the mutual fund ads say, "past performance doesn't guarantee future results." Just because a block was busy yesterday, when we ran the database defrag, doesn't mean it will be busy later today, when we run the end-of-month close. To get the best bang for the buck, vendors will have to keep access metadata over time, so we can write a policy that says, in effect, "move the blocks that were hot the last time we did a data warehouse load to SSD tonight, so we can build a new cube tomorrow."
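Here's a rough sketch, under my own assumptions, of what a history-aware policy could look like: keep per-workload access counts from previous runs and pre-stage whatever was hot the last time that workload ran. The class and function names are hypothetical.

```python
# Hypothetical sketch of a history-aware policy: rather than ranking on
# yesterday's counts, look at what was hot the last time this workload ran
# (say, the previous month-end close) and stage those blocks onto SSD tonight.
from collections import defaultdict


class AccessHistory:
    def __init__(self):
        # workload label (e.g. "warehouse_load") -> {block_id: access count}
        self.by_workload = defaultdict(dict)

    def record(self, workload, access_counts):
        self.by_workload[workload] = dict(access_counts)

    def hot_blocks(self, workload, top_n):
        counts = self.by_workload.get(workload, {})
        return sorted(counts, key=counts.get, reverse=True)[:top_n]


def stage_for(workload, history, ssd_capacity_blocks, mover):
    """Pre-warm the SSD tier with whatever was hot during the last run of the
    named workload, before that workload kicks off again."""
    for block_id in history.hot_blocks(workload, ssd_capacity_blocks):
        mover.promote(block_id)
```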
As the array vendors were ramping up their tiering story, another group -- Gear6, DataRAM, Avere, StorSpeed and, most recently, FalconStor -- decided they could accelerate the move to SSDs plus capacity-oriented drives, now called Flash and Trash, by implementing a huge cache in SSDs. Just as modern CPUs have several tiers of cache (64KB, 256KB and 8MB for a Xeon 5500), these standalone appliances combine RAM, SSDs and, in Avere's case, 15K RPM drives to cache data. Avere calls their FXT a "tiered NAS appliance" and tells me I shouldn't call it a cache. NetApp CEO Tom Georgens says tiering is dying, to be replaced by caching, so there's a bit of name confusion. As Juliet might say, if she were a storage geek, "What's in a name? That which we call a cache by any other name would still be wicked fast."
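The multi-level idea itself is simple enough to sketch; this is my own illustration of a read-through hierarchy, not how any of these appliances actually work internally, and the store interfaces are assumed.

```python
# Hypothetical sketch of the appliance approach: each read falls through RAM,
# then SSD, then the backing NAS/disk, warming the faster levels on the way up.
class TieredReadCache:
    def __init__(self, ram_store, ssd_store, backend):
        # Stores are assumed to expose get(block_id) / put(block_id, data);
        # the backend is the authoritative copy on capacity-oriented disk.
        self.levels = [ram_store, ssd_store]  # fastest first
        self.backend = backend

    def read(self, block_id):
        for level in self.levels:
            data = level.get(block_id)
            if data is not None:
                return data                    # hit in RAM or SSD
        data = self.backend.read(block_id)     # miss: go to spinning disk
        for level in self.levels:
            level.put(block_id, data)          # populate the faster levels
        return data
```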
The cache approach, whether as an appliance or in-array like NetApp's PAM or ZFS's Readzilla/Logzilla combo, seems to me to have a couple of advantages. First, a cache can react to changes in I/O demand immediately, without waiting for a policy engine to run. Second, when a tiering scheme promotes blocks it releases the slow-disk space, so it has to copy the data back down from the higher tier as it "cools off"; even blocks that were promoted purely because of frequent reads, as would be true of some indexes, have to be copied back down as they cool. In a cache system, the copy on disk never goes away, so those I/Os can be avoided.
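A minimal sketch of that demotion difference, again with hypothetical names and interfaces:

```python
# Hypothetical sketch of the demotion difference. In a tier, the fast copy is
# the only copy, so a cooling block has to be written back to slow disk.
# In a read cache, the disk copy is still current, so eviction is just a drop.
def demote_from_tier(block_id, ssd_tier, sata_tier):
    data = ssd_tier.read(block_id)
    sata_tier.write(block_id, data)   # extra I/O to the slow tier
    ssd_tier.free(block_id)


def evict_from_cache(block_id, ssd_cache):
    ssd_cache.discard(block_id)       # no I/O; the backing store has the data
```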
I can't wait for Flash and Trash, but the technology is still young. I'm sure we'll end up with technologies that merge the low cost per IOPS of solid-state memory with the low cost per GB of capacity-oriented disks, and that they'll have aspects of both caching and tiering. I'll beat them all into submission in the lab till the best tech wins.