Meltdown, Spectre Create Data Center Dilemma

CPU flaws raise capacity and cost concerns for data center operators.

Uptime Institute

February 9, 2018


Within the first few days of 2018, security researchers at Google, Graz University of Technology, and several other organizations shocked the world with the disclosure of multiple vulnerabilities found in most modern processors. The IT industry in particular was sent into a panic, as we learned that the Meltdown and Spectre flaws plague just about every single computer chip developed in the past two decades.  

Then came reports that the quickly released software fixes for Meltdown and Spectre result in massive performance degradation. Early assessments suggested that, once patched, some machines take a performance hit as high as 30%. According to Intel, tests conducted since the flaws became public indicate a 2% to 14% decrease in performance. Either way, both figures are concerning and carry major implications for the data center.

Many data center owners and operators are in a tough position, wondering just exactly how bad the impact will be and what it will mean for facility utilization and capacity over the longer term.

For the data center market, it’s important to recognize that the flaws’ impact on capacity and demand will be far less than it would have been if capacity were more homogeneous or if there were less over-provisioning in individual facilities. Uptime Institute survey responses, along with a variety of other sources, suggest that most servers operate at less than 25% of capacity most of the time, and often much lower. While this appears to be a sign of inefficiency, in many cases data centers use it as a deliberate strategy, especially when it comes to dealing with demand peaks.
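To see why spare capacity softens the blow, consider a rough back-of-the-envelope sketch. It assumes, purely for illustration, that a patch reduces a server's throughput uniformly by the stated percentage, so the same workload needs proportionally more of the machine. The function name is hypothetical; the 25% utilization figure and the 2%, 14%, and 30% performance hits are the ones cited in this article.

```python
def patched_utilization(utilization: float, perf_loss: float) -> float:
    """Estimate the utilization needed to carry the same workload
    after throughput drops by perf_loss (a fraction between 0 and 1).

    Simplifying assumption: the slowdown applies uniformly to the workload.
    """
    if not 0.0 <= perf_loss < 1.0:
        raise ValueError("perf_loss must be in the range [0, 1)")
    return utilization / (1.0 - perf_loss)


if __name__ == "__main__":
    # A server running at 25% of capacity, under each cited slowdown.
    for loss in (0.02, 0.14, 0.30):
        after = patched_utilization(0.25, loss)
        print(f"{loss:.0%} slowdown: 25% utilization becomes {after:.1%}")
```

Under these assumptions, even the worst-case 30% hit moves a lightly loaded server from 25% to roughly 36% utilization, which leaves headroom. The picture changes for systems that already run close to full capacity.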


As a matter of fact, when 451 Research asked operators how they currently deal with variable resource requirements due to randomness, time of day, and/or seasonal demand, 61% of respondents said they “overprovisioned.” While this over-provisioning might help to cushion the impact of the processor vulnerabilities, it won’t solve the issue.

In certain situations, even performance hits at the lower end of the range could be expensive, forcing a thorough review of capacity or performance. As things stand, fixing the vulnerability with a software patch will degrade performance, while new hardware without the flaws might still be a year or two away. Many operators could be forced into buying new hardware simply because it is more efficient than the systems they currently have – even though that new hardware is still vulnerable.

One global financial services company told Uptime Institute that its projected capacity requirement has “ballooned,” while a cloud provider has said it wants compensation from its server manufacturer to cover expected increased costs.

The overall impact of Meltdown and Spectre on the data center will depend on a variety of factors. Since the Intel chip flaws were originally revealed, analysts have suggested that certain workloads – such as those with a high input/output requirement – are likely to suffer greater performance degradation. For example, since their systems are most likely to operate at a higher utilization, cloud providers, heavy users of virtualization, and transaction processing applications might be affected the most.

Enterprises will struggle to pass on any costs and will have to pay for infrastructure to address any IT capacity increases, while cloud providers -- who may be forced to provision fewer virtual servers per underlying processor -- can pass on or absorb extra costs. For colocation facilities, however, there is limited downside, since their customers, both enterprises and cloud service providers, may need more space or power.
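For operators whose systems already run near capacity, the arithmetic is less forgiving. The sketch below is an illustration only: it assumes a fully utilized fleet and a uniform slowdown, and the 1,000-server fleet size and function name are hypothetical. The performance-hit percentages are the ones cited earlier in this article.

```python
import math


def servers_needed(current_servers: int, perf_loss: float) -> int:
    """Estimate how many servers deliver the same total throughput
    after each server slows down by perf_loss (a fraction between 0 and 1).

    Simplifying assumption: the fleet is fully utilized and the slowdown
    is uniform, so capacity scales linearly with server count.
    """
    if not 0.0 <= perf_loss < 1.0:
        raise ValueError("perf_loss must be in the range [0, 1)")
    return math.ceil(current_servers / (1.0 - perf_loss))


if __name__ == "__main__":
    fleet = 1000  # hypothetical fleet size, not a figure from any survey
    for loss in (0.02, 0.14, 0.30):
        total = servers_needed(fleet, loss)
        print(f"{loss:.0%} slowdown: {total} servers ({total - fleet} extra)")
```

In this simplified model, a 30% slowdown means a fully loaded fleet needs over 40% more servers, not 30% more, because each remaining unit of capacity has to be bought at the reduced performance level. Each extra server also draws on space, power, and cooling, which is where the colocation demand described above would come from.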

Unfortunately, a long-term resolution will require changes to microprocessor designs, which are likely years away. In the meantime, data center operators will consider a range of options, including using cloud services, consolidating, retrofitting, expanding existing sites, and using colocation or hosted services.

The impact of Meltdown and Spectre is likely to lead to a lot of haggling about costs and compensation, especially if there is a clear performance shortfall and new hardware is required. New systems built on redesigned processors will not be vulnerable, although some redesigns could reduce expected performance gains. The extent of the impact will likely depend on who foots the bill for the servers and supporting infrastructure, where the operator sits in the ecosystem, and whether it is easy to pass on costs.

Andy Lawrence, founding member and executive director of Uptime Institute Research, has built his career focusing on innovative new solutions, emerging technologies and opportunities found at the intersection of IT and infrastructure.

About the Author(s)

Uptime Institute

Uptime Institute is the IT industry’s most trusted and adopted global standard for the proper design, build and operation of data centers -- the backbone of the digital economy. For over 20 years, Uptime Institute has been providing customers with the assurance that their digital infrastructure can perform at a level that is consistent with their business needs, across a wide array of operating conditions. With its data center Tier Standard & Certifications, Management & Operations reviews, Efficient IT Stamp of Approval, and accredited educational curriculum for data center professionals, Uptime Institute helps organizations optimize critical IT assets while managing costs, resources and efficiency. Uptime Institute has become the de facto standard for data center reliability, sustainability and efficiency. Today, thousands of companies rely on Uptime Institute to enable their digital-centric business success.
