Parallel Storage Requires Work, Offers Rewards
For those who need peak performance, parallel storage systems may be the answer. But will they find broader appeal?
November 8, 2008
Parallel storage is enjoying some measure of success in computation-heavy areas like scientific computing and financial services. It offers substantial benefits but takes a lot of work. It isn't always clear which businesses can justify the labor-intensive programming and regression testing needed to make applications interface seamlessly with parallel computing. As a result, it also isn't clear whether the market at large will seriously consider parallel storage and adopt it.
"For many vendors and enterprises, parallel storage is both a carrot and a stick," says Larry Jones, vice president of marketing at parallel storage provider Panasas Inc. "It's a carrot because organizations know that parallel storage will make their applications run faster with the same resources they already have. It's a stick because readying an application for parallel storage takes time, and an organization must ask itself if this is just a one-time solution for one application -- or if the effort will be able to be replicated with other vendors and solutions."
At first blush, inserting code into applications so they can take advantage of parallel storage seems easy. The actual programming may take only a week. However, the lengthy regression testing -- making sure that the software modifications perform correctly in every possible scenario and that computational results are repeatable -- may take another six months.
"Testing every possible computational scenario is critical, because you don't want to find out three years later that there was one scenario you missed that ended up causing the company a $200 million loss," says Arun Taneja, founder of the Taneja Group, the industry research firm.
Some businesses find the lengthy testing process -- and the risks -- worth it. "The computer-aided engineering [CAE] software we sell to our customers is used by engineers to simulate the performance of their product design ideas," says Barbara Hutchings, director of strategic partnerships for Ansys Inc., an engineering software provider. "When we provide our customers parallel storage computing that can eliminate I/O bottlenecks, they can rapidly test virtual representations of their products, complete the design process sooner, and get products to market quicker."
Hutchings describes how parallel computing is used by the auto industry, and its impact on storage:
"Let's say that you are measuring the amount of airflow around a vehicle's side mirror, and that this airflow will oscillate depending on what the car is doing. For every airflow simulation you do, you need to save the results -- and if you're saving only one result per simulation, I/O time is not a bottleneck and you don't have to be concerned about parallel storage. However, our clients want to 'checkpoint' their simulations, a process where they might save 10 different 'snapshot' results during a simulation. In the case of the oscillating flow behind the mirror, data must be saved at hundreds of points because the predicted simulation state varies in time. This dramatically increases the amount of I/Os, and the application becomes very compute-intensive."
Many of Ansys's automotive clients run on large clusters, 48 or 96 CPUs being typical. "In this environment, I/O might represent 10 percent to 15 percent of overall turnaround time and is a significant factor," notes Hutchings. "If the company is running an even larger cluster [100 CPUs], I/O time can grow to 50 percent."
By supporting parallel file handling, Hutchings says, Ansys is able to deliver software to its customers that reduces I/O by an order of magnitude. "For time-varying simulations on large clusters, this means that a simulation can be completed in roughly half the time that it takes in a non-parallel storage environment -- a major impact, since we are talking about many hours." Hutchings says it's hard to quantify a precise return on investment for support of parallel file systems, but that it is a key part of satisfying customer needs for ever more capable high-performance computing.
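The figures Hutchings cites can be checked with simple arithmetic. The sketch below is illustrative only and is not Ansys code: the function name and the assumption that only the I/O portion of a run gets faster (with compute time unchanged) are ours, while the I/O shares and the tenfold reduction come from her remarks.

```python
def remaining_fraction(io_fraction: float, io_speedup: float) -> float:
    """Fraction of the original turnaround time left when only I/O gets faster.

    io_fraction: share of turnaround time spent on I/O (0.0 to 1.0).
    io_speedup:  factor by which I/O time shrinks (10.0 = an order of magnitude).
    Assumption: compute time is unaffected by the storage change.
    """
    compute_fraction = 1.0 - io_fraction
    return compute_fraction + io_fraction / io_speedup


# I/O shares quoted in the article: 10 to 15 percent on typical clusters,
# up to 50 percent on larger ones.
for io_fraction in (0.10, 0.15, 0.50):
    left = remaining_fraction(io_fraction, io_speedup=10.0)
    print(f"I/O share {io_fraction:.0%}: run takes {left:.1%} of the original time")
```

Under these assumptions, a run that spends half its time on I/O finishes in 55 percent of the original time, which is consistent with "roughly half." At a 10 percent I/O share, the same tenfold reduction saves only about 9 percent, which helps explain why the payoff is concentrated in large, checkpoint-heavy simulations.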
Ansys customers are not alone. A brokerage company using parallel computing can potentially place a trade and secure a two-second advantage that can result in a $100 million profit. "When I was first looking at parallel computing as a business opportunity, I would advise my staff to find commercial applications that are really 'computational' under the hood," says analyst Taneja. "Scientific simulations [as in oil and gas, scientific laboratories, or the auto industry] are certainly compute-intensive, but so is Wall Street."
Panasas sees the current scientific and financial applications arena as a multibillion-dollar niche in 2009, although some industry analyst estimates are lower. What Panasas, industry analysts, and Panasas business partners all are betting on is that at some point, the currently proprietary parallel storage technology, sometimes called Parallel NFS, will become part of the NFS 4.1 standard and be positioned to enter the commercial market. "The NFS 4.1 standard has been completed at the [IETF] committee level, and has been reviewed by area directors," says Panasas's Jones. "Changes have been sent back, and we are now in the last phase of what has been an iterative process."
Jones expects the NFS 4.1 standard to be completed no later than 2009. Although Sun Microsystems Inc. (Nasdaq: JAVA), with its Solaris operating system, and IBM Corp. (NYSE: IBM), with AIX, are already incorporating parallel storage into their products, it is the Internet Engineering Task Force (IETF) standardization of NFS 4.1 that Jones sees as the potential "turning point" for entry into a larger commercial market dominated by mainstream Linux users. Jones estimates that "pieces" of a preliminary NFS 4.1 version will be released in early 2009 and that a full version of 4.1 will be available in late 2009. He predicts enterprise IT departments may see Red Hat Inc. (Nasdaq: RHAT) and SuSE Inc. Linux products with parallel storage in 2010.
"Parallel I/O is a great way to significantly increase performance, which is already needed in every area that has to do with I/O-intensive computations and with just about any application that manipulates rich media," says Mike Karp, senior analyst for research firm Enterprise Management Associates. "As throughput increases in the commercial segment, parallel will move increasingly into the mainstream."