Will Containers Sink The All-Flash Array?

All-flash arrays are touted as solutions for virtual desktop deployments and big data, but virtual containers and Docker could undercut that.

Jim O'Reilly

April 14, 2015

All-flash arrays (AFAs) are hot in storage today. They provide low-latency access and phenomenal I/O rates, exceeding a million I/O operations per second. Because of this performance, and the relative ease of installation in existing SANs, AFAs have done well in the market.

The use cases for AFA cover a broad spectrum, but two stand out: supporting virtual desktops and big data. However, the rise of virtual containers and Docker could provide an alternative approach for these use cases, potentially putting a damper on demand for all-flash arrays.

Virtual desktops

Most AFA vendors talk up how well they service virtual desktop installations. Here, the industry has been bothered by the boot storm phenomenon, which manifests as a demand for tremendous I/O rates at fixed points in time. The root cause is the need to boot a company's virtual desktops at the start of the business day.

Booting a simulated PC housed in a virtual server instance requires all of the OS files and applications to be loaded, followed by personalization files for the specific user. This is exactly like booting a physical PC, but with hundreds or even thousands of desktops booting at the same time, the load overwhelms most networked storage solutions.

An AFA is a good fix for this problem. It can deliver the huge burst of I/O that booting demands, and with the trend toward cloud-based office computing and SaaS, it seems like a case of "problem solved!"

Hold on, though. With all those desktops being virtually identical, why transfer many copies of the files in the first place? NetApp, for example, has done a stellar job of deduplicating such files, and can boot thousands of virtual instances from a single image copy of the OS/app stack. The problem is that current virtualization solutions still require the filer to deliver un-deduplicated files. In other words, the same files are sent over and over again, taking up just the same network bandwidth as before.

Clearly, a better solution is to send only a single copy of the OS/app stack to each server, but this requires the server either to deduplicate on receipt or to use the OS/app stack in a different way. The latter is the virtual container approach, as exemplified by Docker. It inverts the hypervisor worldview, in which the OS and apps reside in each isolated virtual machine: a single copy of the stack runs on the server, and each instance lives within its own isolated space on that stack.

The result is that we will no longer need to burst many copies of the same OS/app stack across the network to start the day. Each server loads the stack once, and as a bonus, each instance uses much less DRAM, so a server can host two or three times as many instances.
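
To make the contrast concrete, here is a minimal sketch using the Docker SDK for Python. The image name and the per-user session names are hypothetical stand-ins, not any vendor's product; the point is simply that the OS/app image lands on a host once, and each additional desktop instance is just a lightweight container on that shared stack rather than another full boot.

```python
# A minimal sketch, assuming the Docker SDK for Python ("docker" package) is
# installed and a local Docker daemon is running. The image name
# "corp/desktop-stack" is hypothetical -- it stands in for a single shared
# OS/app image that every desktop session reuses.
import docker

client = docker.from_env()

# Docker pulls and stores the image layers once per host; every session
# below reuses them instead of loading its own copy of the OS/app stack.
sessions = []
for user in ["alice", "bob", "carol"]:
    container = client.containers.run(
        "corp/desktop-stack:latest",   # one shared OS/app image
        name=f"desktop-{user}",        # hypothetical per-user session name
        detach=True,
        mem_limit="512m",              # far less DRAM than a full VM per user
    )
    sessions.append(container)

print(f"Started {len(sessions)} desktop sessions from one shared image")
```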

Once VDI is fully implemented on the container model, the need for an AFA just to handle booting goes away. Non-boot traffic is relatively low in comparison; an AFA will certainly deliver a rapid response to data loading, but it is probably overkill in many environments.

Big data

The other hot use case for AFA is very new. Just recently, we've seen the arrival of petabyte-class AFA solutions. These huge, pricey units contain 512 TB or more of flash and are aimed at the rapidly growing big data market. In-memory and GPU-based analytics clouds are very data-hungry, and the ability of a jumbo AFA to deliver high performance seems to fit those needs well.

Containers won't change the need for large data flows, and the OS/app stack load is small compared with the data sets. On the surface, this seems to make containers irrelevant to this use case, but a little thought about how the data will flow changes that seemingly foregone conclusion.

The growth in big data comes from the huge amount of "sensor" data that will be generated, ranging from cameras to biomedical sensors to cursor and eyeball positioning on web pages. Many IT pros envision all of that data hitting the data center, rather like an Amazon-sized river of data flowing through it.

In reality, we'll reverse the thought process we used to distribute web delivery systems (remember Akamai?) and use local processing to trim the stream down to size. A simple example is a surveillance camera in a lobby: it will show mostly the same scene all night, so local processing can reduce the data flowing to the central data center by factors of thousands.

The obvious place to host that data reduction is a local cloud, and containers will likely be the solution of choice for it. Reducing the data flow into the central data center has a significant impact on the number of AFA units needed, and may allow even cheaper SSD appliances to be used instead.
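
As a rough illustration of that kind of edge reduction, here is a minimal Python sketch of frame differencing that could run in a container on a local node. OpenCV and the forward_to_datacenter() hook are assumptions for the sketch, not part of any particular product; the idea is that frames showing no change never cross the WAN, which is where the thousand-fold reduction comes from.

```python
# A minimal sketch, assuming OpenCV (cv2) is available on the edge node and the
# lobby camera is reachable as device 0. forward_to_datacenter() is a
# hypothetical stand-in for whatever uplink the central site actually uses.
import cv2

def forward_to_datacenter(frame):
    """Hypothetical uplink; in practice this might write to a queue or object store."""
    pass

cap = cv2.VideoCapture(0)      # the lobby camera
prev_gray = None
MOTION_PIXELS = 5000           # tuning threshold: how much change counts as activity

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    if prev_gray is not None:
        diff = cv2.absdiff(prev_gray, gray)
        _, changed = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
        # Only frames showing real activity leave the building; a static
        # overnight lobby generates almost no upstream traffic.
        if cv2.countNonZero(changed) > MOTION_PIXELS:
            forward_to_datacenter(frame)
    prev_gray = gray
```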

So, in both major use cases for AFA, the container model of virtualization will lessen the need for all-flash arrays considerably. That shrinks the total available market for this type of device, which suggests AFA vendors should concentrate on other use cases for business growth. With flash prices poised to reach parity with spinning disks by 2017, the AFA market is unlikely to evaporate, but price compression is likely, and small SSD-based appliances at much lower prices may prove strong competitors.

About the Author

Jim O'Reilly

President

Jim O'Reilly was Vice President of Engineering at Germane Systems, where he created ruggedized servers and storage for the US submarine fleet. He has also held senior management positions at SGI/Rackable and Verari; was CEO at startups Scalant and CDS; headed operations at PC Brand and Metalithic; and led major divisions of Memorex-Telex and NCR, where his team developed the first SCSI ASIC, now in the Smithsonian. Jim is currently a consultant focused on storage and cloud computing.
