
Service Health Analyzer: Identifying Problems via Layer Analysis





This blog was written by Yohay Golan of the SHA R&D team. 


Challenges of Analyzing Layers

One of the challenges IT engineers face today is that they must thoroughly analyze the metrics causing an anomaly before they can conclude that those metrics belong to a common layer. Service Health Analyzer (SHA) analyzes the abnormal behavior of metrics by layer and determines whether one layer is significantly more problematic than the others.


For example, in a non-SHA deployment you need to manually analyze and break down reports (such as transaction data) from the End User Management (EUM) data collectors for synthetic transactions (BPM) and real-user transactions (RUM) to identify the problematic layer.

SHA performs this analysis automatically, saving the subject matter expert time that would otherwise be wasted.


Solution with Service Health Analyzer

The baseline engine of SHA, known as the Real-time Detection Engine (or RAD Engine), calculates a dynamic baseline sleeve for these transactions: per transaction, per location, and per layer. Setting a static threshold for every transaction layer is impractical in real deployments because the configuration overhead would be too high.
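
To illustrate the idea of a baseline kept per transaction, per location, and per layer, here is a minimal Python sketch. It is not the RAD Engine's algorithm, which this post does not describe. It assumes a simple rolling mean plus or minus k standard deviations as the "sleeve", and the class name `BaselineSleeve` and its parameters (`window`, `k`, `min_samples`) are hypothetical.

```python
from collections import defaultdict, deque
from statistics import mean, pstdev


class BaselineSleeve:
    """Hypothetical rolling band (mean +/- k * stddev) per metric key.

    Assumption: a plain rolling window stands in for SHA's real baseline.
    """

    def __init__(self, window=20, k=3.0, min_samples=5):
        self.k = k
        self.min_samples = min_samples
        # One bounded history per (transaction, location, layer) key.
        self.history = defaultdict(lambda: deque(maxlen=window))

    def is_abnormal(self, transaction, location, layer, value):
        """Return True if value falls outside the current sleeve for its key."""
        past = self.history[(transaction, location, layer)]
        abnormal = False
        # Only judge a sample once enough history exists for this key.
        if len(past) >= self.min_samples:
            center = mean(past)
            spread = pstdev(past)
            abnormal = abs(value - center) > self.k * spread
        # Simplification: every sample, abnormal or not, joins the history.
        past.append(value)
        return abnormal


sleeve = BaselineSleeve()
for response_ms in [100, 102, 98, 101, 99, 100]:
    sleeve.is_abnormal("login", "London", "Network", response_ms)
print(sleeve.is_abnormal("login", "London", "Network", 500))
```

Because each key carries its own history, nobody has to configure a threshold per transaction layer by hand, which is the point made above about static thresholds.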


SHA automatically analyzes abnormal data across all domains and transactions by layer. It then performs an additional statistical analysis on all the abnormal layers and, in some instances, points out the most problematic layer.


By default the following EUM layers are analyzed:

  1. SSL
  2. Connection Time
  3. Network
  4. Server
  5. Download Time


Layers can be added or removed by editing the metrics metadata.


The following screenshot shows a real use case from one of our beta customers, with the transaction layers being analyzed by SHA: Download time and Network time.

Transaction Layers Analyzed by SHA.jpg 


Problematic Layer Analysis

SHA performs statistical analysis on the abnormal metrics and identifies a problematic layer if one exists. A problematic layer is indicated in the Anomaly Highlights user interface page of SHA, and in the details of the event that SHA sends to an event console, such as OMi.


This is how the suspect layer appears in the Anomaly Highlights page:

Layer Suspects.jpg 


How Problematic Layers Are Calculated

  1. The number of abnormal metric samples per layer is calculated.  In the following example we have identified three layers and the number of abnormal metric samples per layer, as follows:
    • Download Time: 5
    • Network: 12
    • SSL time: 1
  2. The 90th percentile is calculated on the findings of step 1. In our example the 90th percentile is 10.6.  Note: The percentile used to identify distinct layers can be configured in the infrastructure settings.
  3. If only one layer has a value above the 90th percentile, it is considered a distinct layer and is pointed out to the user.

In our example only one layer has a value greater than 10.6: the Network time layer, with a value of 12. SHA therefore indicates in the event description and in the Anomaly Highlights page that Network time is a problematic layer. This may help to identify the root cause of the anomaly.
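
The calculation above can be reproduced with a short Python sketch. It assumes the percentile is computed with linear interpolation between the closest ranks, which is the method that yields 10.6 for the counts 1, 5, and 12. The post does not say which percentile method SHA uses, and the function names `percentile` and `distinct_layer` are hypothetical.

```python
import math


def percentile(values, pct):
    """Percentile using linear interpolation between closest ranks (assumed method)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("at least one value is required")
    rank = (pct / 100.0) * (len(ordered) - 1)
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def distinct_layer(abnormal_counts, pct=90):
    """Return the single layer whose count exceeds the percentile, else None."""
    threshold = percentile(abnormal_counts.values(), pct)
    above = [layer for layer, count in abnormal_counts.items() if count > threshold]
    return above[0] if len(above) == 1 else None


# Abnormal metric samples per layer, taken from the example in step 1.
counts = {"Download Time": 5, "Network": 12, "SSL time": 1}
print(round(percentile(counts.values(), 90), 1))  # prints 10.6
print(distinct_layer(counts))                     # prints Network
```

The `pct` argument mirrors the configurable percentile mentioned in step 2. When no layer, or more than one layer, exceeds the threshold, the sketch returns `None`, matching the case where no distinct layer is reported.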


To learn more about the analytical nature of Service Health Analyzer, please see the SHA technical whitepaper.

About the Author


Product Marketing Manager for HP Application Performance Management suite of software products. Before this role, I worked in the HP StorageWorks Division working as both a Product Marketing Manager overseeing enterprise hardware and software, as well as working as Business Development Manager for the Enterprise Services channel.