Server Performance Vs. Health

In IT monitoring, performance and health metrics both serve a purpose.

Thomas LaRock

July 21, 2017


When system administrators are given a choice between having a healthy server or one that performs well, they opt for performance. The reason is simple enough: performance pays. While sysadmins may keep their jobs with recovery, we all know that we get paid for performance; health is always secondary.

When the same choice is applied to humans, you get a different perspective. I may be perfectly healthy, but there is no way my legs could carry me through a four-minute mile. You’d be hard pressed to find someone who wouldn’t take health over performance for their own body. Most of us know not to confuse our health with our ability to perform.

So why don’t we place more value on server health over server performance?

Well, as I stated before, the reason is that IT pros are not compensated for the health of a server. But it's also because we don’t measure the health of a server unless we’re using metrics based on performance. As a result, there exists a myriad of tools on the market that jumble together server health and server performance metrics.

The truth is that my server could be healthy, but not able to meet the performance demands of end users. Even if performance demands are met, the server could be on the verge of breaking down beyond repair.

Looking at the difference between the two, we can see where each has a purpose in a hierarchy of monitoring needs. Performance metrics help to measure throughput, and give us an idea of how to properly tune a workload or query. Health metrics help to measure resource capacity, and give us an idea of whether hardware components are on the verge of failure.

Let’s take a common example: a simple database query that is consuming 6% of the overall CPU. Performance metrics will allow us to tune the query to use less CPU. But health metrics will help us understand whether that 6% CPU usage is causing issues for other processes. In addition, having a baseline of previous query performance will help us understand whether the 6% CPU usage is typical.
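To make the baseline idea concrete, here is a minimal Python sketch of one way to decide whether a reading is typical. It assumes the history is a short list of past CPU percentages for the same query. The function name `is_typical`, the sample values, and the two-standard-deviation tolerance are all illustrative assumptions, not something prescribed by any particular monitoring tool.

```python
from statistics import mean, stdev


def is_typical(current, history, tolerance=2.0):
    """Return True when `current` lies within `tolerance` sample standard
    deviations of the mean of `history`.

    The 2.0 default is an arbitrary illustrative threshold (assumption).
    """
    if len(history) < 2:
        raise ValueError("need at least two historical samples")
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Perfectly flat history: only an identical reading counts as typical.
        return current == mu
    return abs(current - mu) <= tolerance * sigma


# Hypothetical baseline: CPU percentage used by the same query on earlier runs.
history_cpu_pct = [5.5, 6.2, 5.8, 6.1, 6.4, 5.9]

print(is_typical(6.0, history_cpu_pct))   # a 6% reading against this baseline
print(is_typical(14.0, history_cpu_pct))  # a 14% reading against this baseline
```

A real baseline would likely account for time of day and workload mix. The point here is only that "is 6% typical?" becomes an answerable question once history is kept.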

Putting all of this together will help us understand if we need to spend time tuning the workload, or if the time is right to scale up and/or out. Also, when performance and health metrics are combined, you can build actionable alerts that have the potential to eliminate the many hours spent in a reactive mode, fighting fires.
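As a rough illustration of combining the two kinds of metrics, the Python sketch below pairs one performance metric (a query's CPU share versus its baseline) with one health metric (total host CPU utilization) and returns a suggested action. The `Snapshot` fields, the `recommend` function, and both thresholds are hypothetical choices made for this example. Real alerting rules would need tuning for each environment.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    query_cpu_pct: float     # performance metric: CPU share used by the query
    baseline_cpu_pct: float  # typical CPU share for the same query
    host_cpu_pct: float      # health metric: total CPU utilization on the host


def recommend(s, regression_factor=1.5, saturation_pct=85.0):
    """Suggest an action from one combined snapshot.

    Both thresholds are illustrative assumptions, not recommended values.
    """
    regressed = s.query_cpu_pct > regression_factor * s.baseline_cpu_pct
    saturated = s.host_cpu_pct >= saturation_pct
    if regressed:
        # The workload uses far more CPU than it normally does: tune it first.
        return "tune workload"
    if saturated:
        # The workload looks normal but the host is out of headroom.
        return "scale up or out"
    return "no action"


print(recommend(Snapshot(6.0, 6.0, 40.0)))   # normal query, healthy host
print(recommend(Snapshot(18.0, 6.0, 60.0)))  # query well above its baseline
print(recommend(Snapshot(6.0, 6.0, 92.0)))   # normal query, saturated host
```

Neither metric alone could separate the second case from the third. The performance number says whether the workload has changed, and the health number says whether the server has room left.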

When you combine both health and performance metrics, you will get the right data, at the right time, so you can perform the right actions for your end users.

About the Author(s)

Thomas LaRock

Head Geek, SolarWinds

As a Head Geek for SolarWinds, Thomas works with a variety of customers to help solve problems regarding database performance tuning and virtualization. He has over 15 years of IT experience, holding various roles such as programmer, developer, analyst and database administrator. Thomas joined SolarWinds through the acquisition of Confio Software, where he was a technical evangelist. He also serves on the Board of Directors for the Professional Association for SQL Server. Thomas is an avid blogger and the author of DBA Survivor: Become a Rock Star DBA, a book designed to give a junior to mid-level DBA a better understanding of what skills are needed in order to survive (and thrive) in their career. He is a Microsoft Certified Master, SQL Server MVP and holds an MS in Mathematics from Washington State University as well as a BA in Mathematics from Merrimack College.
