Watching the Web Apps

It's critical to identify points of failure in your Web applications. Smart monitoring across multiple tiers can help pin down problems and abolish the blame game. Learn what to expect.

November 12, 2004

12 Min Read

Alert floods like these have long been stemmed with fault-management correlation--effective as long as you're willing to commit the resources to maintain the rules required to make sense of the data deluge. But now performance-management vendors offer another approach, one that bridges deep-dive, domain-specific performance metrics with end-to-end service views to give operations just enough information to perform problem determination. An added benefit is that all three constituencies--business owners, operations and system administrators--will stay (we hope) on the same page through consistent data collection and performance presentation.

So far, vendors seem to be succeeding: 60 percent of the readers we polled for this article who use performance-monitoring tools say they're happy with the value being delivered, and in nearly 20 percent of shops, business-line managers use the products.

Finding a performance-management tool that meets all these requirements, at a justifiable cost, requires more than deciphering each vendor's price schedule. For example, nearly half the readers we surveyed said allocating or budgeting enough personnel was the biggest difficulty they faced when deploying performance management. And the products offer a wide range of options and deployment models that fit differently depending on your needs. But before you start comparing prices, you must understand the performance-management techniques commonly deployed.

nTier This!

At the center of all, or at least most, e-business systems is a multitiered application-delivery cocktail. Often referred to as "nTier," it comprises a mix of devices and functions that sit behind a Web server. The Web tier is divided into subcomponents: load balancer, Web server hardware, operating systems and Web server software. The next tier of application servers includes hardware and operating systems running application servers, such as BEA Systems' WebLogic or IBM's WebSphere. Behind most application servers is a database tier, again with system hardware and operating system, reaching into the back-end storage tier. But that's not all--you'll need the usual partner processing that grabs data from other sites or shuttles payment authorization, and, of course, everything must be connected by switches, routers, DNS servers and VPN gateways.

Overlaying this processing environment are the so-called "transactions" resulting from users and consumers browsing the Web for whatever service is being delivered. From the Web server point of view, transactions are really just a sequence of Web pages that business owners define as a critical path to commerce, lead generation or information gathering.

Although the Web pages form a desired path, they aren't transactions in the sense that a purchase at the end of a shopping-cart sequence is a transaction that charges your account and places the order, updating inventory and writing to the general ledger. These are better thought of as critical paths that need watching. The problem is that there's no inherent correlation among the systems hosting all this nTier activity. The fact that a Web page presents a list of golf clubs and a purchase price is in no way directly tied to the process the application server initiated against the database server to search available inventory and current price lists.

Thus, it's no surprise that performance-management vendors focus on the transaction and seek to correlate performance metrics. They use two basic kinds of end-to-end monitoring: active, robotic-scripted transactions and passive monitoring of real transactions. Robotic transactions are artificially created, continuously scheduled replays of a recorded critical path (transaction). They confirm that the function still works and show how long it takes--like running a baseline over and over to see whether anything has changed.

This isn't to say these products measure only how long the entire active transaction takes to complete; rather, they capture time on the network, time to do a DNS lookup, time to download the first byte, time to download each page object, time to download the last byte and any errors that occur. These products won't tell you which back-end tier is sucking up the most resources, only the response as viewed from the Web server. But active transactions also can show how performance predicted in preproduction fares in the real world as systems and application-development changes tweak the nTier delivery mechanism.
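To make the active approach concrete, here's a minimal sketch of what a robotic probe does under the hood, written against the standard Java networking classes. The URL is hypothetical, and a real product replays a whole recorded multi-page script and captures per-object timings; this shows only the DNS, first-byte and last-byte breakdown described above.

```java
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.InetAddress;
import java.net.URL;

public class RoboticProbe {
    public static void main(String[] args) throws Exception {
        // Hypothetical critical-path page; a real script replays a full sequence.
        URL url = new URL("http://www.example.com/store/checkout");

        long start = System.currentTimeMillis();
        InetAddress.getByName(url.getHost());   // time the DNS lookup separately
        long dnsDone = System.currentTimeMillis();

        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        InputStream in = conn.getInputStream();  // connect and send the request
        in.read();                               // first byte arrives here
        long firstByte = System.currentTimeMillis();

        byte[] buf = new byte[8192];
        while (in.read(buf) != -1) { /* drain the body: time to last byte */ }
        long lastByte = System.currentTimeMillis();
        in.close();

        System.out.printf("dns=%dms firstByte=%dms total=%dms status=%d%n",
                dnsDone - start, firstByte - start, lastByte - start,
                conn.getResponseCode());
    }
}
```

Run on a schedule, the output becomes the baseline: any drift in the first-byte or total numbers signals a change somewhere behind the Web server, even though the probe can't say in which tier.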

Most performance-monitoring vendors use passive monitors to peel the transaction onion, at server or client, watching and recording what's going on. The passive approach also is preferable when measuring what's taking place, transaction-wise, from a load point of view; active transactions, in contrast, are predictable. The common example is checking end-to-end performance when no users are running transactions, as is often the case after system upgrades at oh-late-hundred Sunday night.

Passive collection does carry a downside when deployed at the client side: managing the minor--but real--footprint passive agents leave on the end-user workstation. That footprint can take the form of Java code or an ActiveX component, so a Win32 agent isn't always necessary, but even this low-touch deployment is outside the direct control of most IT operations, making it one more organizational hurdle to clear. And the hurdle isn't so much technological as political: Desktop-deployment folks are understandably skittish about the operations department adding executables to their desktop images.
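On the server side, passive collection amounts to instrumentation sitting in the real request path. A rough sketch of the idea using the standard servlet Filter API--the class name and log destination are our own; a commercial agent hooks in at roughly this level but ships its samples to a central collector rather than a log:

```java
import java.io.IOException;
import javax.servlet.*;

// Hypothetical passive monitor: times every real user request passing
// through the Web tier, with no synthetic traffic involved.
public class ResponseTimeFilter implements Filter {
    public void init(FilterConfig config) {}

    public void doFilter(ServletRequest req, ServletResponse resp,
                         FilterChain chain) throws IOException, ServletException {
        long start = System.currentTimeMillis();
        try {
            chain.doFilter(req, resp);   // let the real transaction proceed
        } finally {
            long elapsed = System.currentTimeMillis() - start;
            // A commercial agent would forward this to a central data store.
            System.out.println("passive sample: " + elapsed + "ms");
        }
    }

    public void destroy() {}
}
```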

New to the mix of functionality is what we're calling smart performance management--mining and stitching together back-end nTier performance data. For the most part, this is a furtherance of performance-metric gathering at the application server. Strictly speaking, smart performance management isn't transaction monitoring at all, but rather monitoring the effects of transactions on the application server. Many of the products we tested in our review of smart Web performance managers (see "Get Smart") use IBM WebSphere's PMI (Performance Monitoring Infrastructure) interface. These products don't care about the internals of a transaction, just that the transaction takes place and how well each component does its job. Performance vendors track, over HTTP or SOAP, the performance statistics this interface provides.

These performance metrics are averaged buckets covering a range of stats, including but not limited to thread pools, connection pool size, persisted session size and cache size. The wealth of data is as great as it is overwhelming ... great to those managing the application domain, overwhelming to those at the operations or business end. Wondering why you can't just use the perfmon-type utilities that come with systems and application servers? You can, but you won't get a centralized data store with reports and views that not only look across all your systems but also go out and gather robotic measurements.
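WebSphere's PMI has its own client API, so treat the following as an illustration of the general idea using the generic JMX plumbing most Java application servers expose. The endpoint, MBean name and attribute names here are hypothetical; the point is that a poller pulls the averaged pool buckets described above into a central store:

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class PoolPoller {
    public static void main(String[] args) throws Exception {
        // Hypothetical management endpoint; PMI's client API differs in detail.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://appserver:9999/jmxrmi");
        JMXConnector jmxc = JMXConnectorFactory.connect(url, null);
        MBeanServerConnection mbs = jmxc.getMBeanServerConnection();

        // Hypothetical MBean exposing a JDBC connection pool.
        ObjectName pool = new ObjectName("example:type=ConnectionPool,name=orders");

        // Poll the kind of averaged performance buckets described above.
        Object size = mbs.getAttribute(pool, "PoolSize");
        Object waiters = mbs.getAttribute(pool, "WaitingThreadCount");
        System.out.println("pool=" + size + " waiting=" + waiters);

        jmxc.close();
    }
}
```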

More valuable still is gathering actual transaction data. This, as mentioned, is difficult because transactions are not serialized and must traverse multiple tiers. So although application servers will report on the performance of pools and memory, they won't notify you that a transaction is taking too long because, say, a JavaBean is slower today than it was yesterday. We were amazed that at least one vendor has serialized transactions so they can be followed for performance across the nTier architecture.
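We can't speak to how that vendor implements serialization, but the general technique is to mint a correlation ID at the edge and propagate it on every downstream call so each tier's timings can be joined back into one transaction. A hedged sketch; the header name and URL are our invention:

```java
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.UUID;

public class CorrelatedCall {
    // Hypothetical header; each tier logs its timings against this ID so a
    // monitor can stitch one transaction back together across the nTier path.
    static final String HEADER = "X-Txn-Id";

    public static void main(String[] args) throws Exception {
        String txnId = UUID.randomUUID().toString(); // minted at the Web tier

        URL appTier = new URL("http://appserver/inventory/lookup");
        HttpURLConnection conn = (HttpURLConnection) appTier.openConnection();
        conn.setRequestProperty(HEADER, txnId);      // propagate downstream

        long start = System.currentTimeMillis();
        conn.getInputStream().close();               // the app-tier call itself
        long elapsed = System.currentTimeMillis() - start;

        // Each tier emits (txnId, tier, elapsed); the collector joins on txnId.
        System.out.println(txnId + " web->app " + elapsed + "ms");
    }
}
```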

This doesn't mean we could see the big picture--that a set of Web pages is a bunch of browses to buy golf clubs. Rather, it's more focused, showing that a Web page click requesting anti-slice golf clubs triggers a database lookup and the presentation of Magic-U-Wish clubs. The breakdown is by transaction and tier, with connections back to thresholds, so alerts on bean performance that's slow relative to yesterday are possible.

This is obviously valuable from the application-development and application-server domain-admin points of view. It also ties in the resulting database queries--another obvious value. But because it generates significant amounts of performance data, the process must be well-defined to deliver triage value to operations.

Figuring ROI

The main reason to invest in monitors is improved availability and response times. Longer MTBF (mean time between failures) and shorter MTTR (mean time to restore) come from knitting the customer's view of a service--the end-to-end view--together with the underlying IT infrastructure, using transactions as the yarn.
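The standard availability arithmetic shows why both numbers matter: availability = MTBF / (MTBF + MTTR). A service that fails on average every 720 hours and takes four hours to restore is 99.45 percent available; cut MTTR to one hour and availability climbs to 99.86 percent without changing how often things break. Faster problem determination attacks the MTTR side directly.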

Note that none of the products we tested can predict or alert on failures before customers or the business is impacted. Proactive management is a marketing term. But better coordination between operational groups and application development will speed reaction times and let IT focus and organize its energies: Problem triage isn't done in a vacuum--you need to take every system you're responsible for into account, so the knitted-together view will let organizations spend IT resources more wisely.

So who gets a seat at the decision table when choosing a performance-management system? The performance information outlined above can be used by many different groups. We divide info consumers into the trinity of business owner, operations and system administrator, but most organizations have other groups that also will use the data smart performance products provide to improve operational efficiency.

At the highest, broadest level, business owners and even customers have an interest in IT's service delivery, so they need to be part of defining the deliverable. However, all the performance monitors we tested deliver good end-to-end availability and response-time views, so boiling complex nTier service nuts and bolts down to the requisite happy and sad faces is a given.

At the boundary of IT and the business is the service staff. By definition, they are customer-facing and perform the first, most basic problem triage. Here significant improvements in workflow and time to repair can be realized. These products let service staffers skip operations and go directly to a domain-specific administrator; this insight comes from using the same tool and underlying data collection to view the problem at hand. And service desks have a deep customer awareness that makes their input into a purchase vital.

At the core of most IT shops is operations, tasked with running the nTier systems that deliver the business services and constantly monitoring system health. Ops staffers are often the prime movers behind purchasing these systems because they're tasked with mediating problems in the nTier domains. Just as the service desk can help translate business end-to-end service monitors, operations can help the administrators of each nTier domain populate the performance product with meaningful statistics. This boiling down to vital performance metrics is required if you expect an ROI from any performance tool.

Finally, closest to the problem and performance of nTier systems are your application developers, database administrators, systems administrators, network engineers and storage administrators. Each of these domains will have its own pet tools for diving deep into performance. However, for a performance-management tool to provide significant value, it must address at least some of the metrics native to each of these domains. So, for example, knowing that many products use the PMI interface into WebSphere is crucial when choosing a performance-management tool for a WebSphere shop.

We also recommend you consider replacing existing point tools with a broader performance-management product like those we tested. Obviously, this isn't easy to justify--and the suggestion may be met with outright hostility--but there are a few key facts to remember. First, the more instrumentation a product offers in the form of server-resident agents, the more likely its level of granularity will be acceptable. When the projected return isn't worth the replacement battle, look for products that can consume domain-specific performance data sources natively. But beware: More involved implementation on the server means more product-implementation time overall.

Finally, ROI gains can come from carrying performance management through application development and across the application life cycle. This can further connect operations, domain administration and application development by reusing, in an operational setting, the same transactions that proved baseline application functionality.

Bruce Boardman, executive editor of Network Computing, tests and writes about network management and systems. He has 12 years' experience managing networks and distributed computing for a financial service provider. Write to him at [email protected].

If you've ever endured a marathon "What went wrong?" session in the wake of a Web application failure, you're probably willing to pay any price to stop the madness. And you just might get the budget: Web apps are among the most critical systems a business runs--American consumers will spend $121 billion online this year, according to eMarketer--yet monitoring these applications is a complex proposition. Smart Web application performance-monitoring products can help. They're not cheap, but in "No More Finger-Pointing," we tell you what you can expect from your investment, how these suites work and how to calculate your real costs.

We also tested six performance monitors: Computer Associates International's Unicenter, Concord Communications' eHealth Suite, Empirix's OneSight, Hewlett-Packard Co.'s OpenView, NetIQ Corp.'s AppManager Suite and ProactiveNet's ProactiveNet, in our Syracuse University and NWC Inc. Real-World Labs®. We determined how well these products gather information about Web-facing applications and devices in a large network, and how clearly they present that information to stakeholders.

We wouldn't throw away any of the products we tested, but a few stood out. HP has, hands down, the best monitoring; it also has tons of knobs to turn and gave us plenty of implementation headaches. Empirix and ProactiveNet stayed close till the end, but Empirix pulled ahead to win our Editor's Choice award on the strength of its intuitive interface. To get your money's worth from these suites, they must be accessible to non-IT folks, and OneSight will endear itself to a range of users.
