Get agent health at a glance with the Health Dashboard for infrastructure monitoring

GirishMatti

What is one of the most common situations facing IT environment owners? Some HP Operations agent nodes stop sending messages at the time configured in the policy, or do not work as expected.

 

These issues can have many causes, including:

  • Operations Agent runtime issues
  • System Resource and Performance issues
  • Configuration issues
  • Software (including Operations Agent) upgrade issues

 

So what’s the impact of nodes not communicating properly?

 

Agent failure can affect the monitoring of critical services. In enterprise environments, it takes time to detect agent nodes that may have failed, troubleshoot them, and fix the issues. When this happens, the very reason the HP Operations Manager infrastructure monitoring agent is installed on the node (application and systems monitoring) is defeated.

 

 

How do you fix this?


For every customer there is a need to reduce MTTR (Mean Time to Repair) and increase MTBF (Mean Time Between Failures).
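To see why both numbers matter, here is a minimal sketch using the standard steady-state availability relation, availability = MTBF / (MTBF + MTTR). The function name and the hour figures are illustrative assumptions, not measurements from any real environment.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Fraction of time a service is up, given MTBF and MTTR in hours."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR must not be negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Illustrative figures only: the same MTBF with a slow and a fast repair time.
slow_repair = availability(mtbf_hours=500, mttr_hours=8)
fast_repair = availability(mtbf_hours=500, mttr_hours=1)
print(f"MTTR 8h: {slow_repair:.4%}  MTTR 1h: {fast_repair:.4%}")
```

Shortening the time to detect and repair a failed agent raises availability even when failures happen just as often, which is the gap a health dashboard is meant to close.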

 

 

To fix such Operations agent health issues, we ideally need an intuitive central dashboard that shows the overall health of the Operations Manager agents.

 

This dashboard should let you drill down into each agent node, and then into sub-agent health, to quickly identify, troubleshoot, and fix monitoring issues. In addition, it should provide meaningful logs and events, with enough detail to act on and fix the problem wherever possible.

 

 

Figure 1: How to identify and fix agent problems

 

 

Here comes the Health Dashboard with the new HP Operations agent 12.0

 

With the latest release, HP Operations Agent version 12.0, the approach described above for identifying agent health issues has been implemented, and a Health Monitor Dashboard is now available!

 



 

Start with the dashboard, see which nodes need attention, and drill down to the node.

Figure 2 - Agent Health Dashboard  

 

 

 


Figure 3 - Node and Process views

 

Then drill down further to the sub-agent and, voilà, there is the reason for the agent failure!

 

By using the Health View Dashboard, you can detect:

  • Operations Agent runtime issues, such as sub-agent hangs and aborts, with meaningful information about why they happen
  • Security-related issues, such as installed certificates and license settings
  • System resource and performance issues, such as the current resource utilization of agent processes (CPU, memory, disk, threads, semaphores/handles, agent disk space utilization, growth patterns, etc.)
  • Overall system resource usage and availability
  • Agent runtime configuration issues, such as policy runtime state (Enabled/Disabled, collection state, last run, missed intervals, etc.)
  • Runtime configuration (variables) state
  • Errors in agent logs
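
To make the "last run" and "missed intervals" idea concrete, here is a minimal sketch of how missed policy intervals could be derived from a policy's configured interval and its last recorded run. It is an illustration under stated assumptions, not how the Health View computes them: `PolicyState`, `missed_intervals`, and `nodes_needing_attention` are hypothetical names, and the sample data is invented.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class PolicyState:
    """Hypothetical record of one policy's runtime state on one node."""
    node: str
    policy: str
    enabled: bool
    interval: timedelta   # configured collection interval
    last_run: datetime    # time of the last recorded run


def missed_intervals(state: PolicyState, now: datetime) -> int:
    """Count scheduled runs that have come due since the last recorded run."""
    if not state.enabled:
        return 0  # a disabled policy is not expected to run
    elapsed = now - state.last_run
    # timedelta // timedelta yields a whole number of elapsed intervals;
    # max() guards against a last_run that lies slightly in the future.
    return max(0, elapsed // state.interval)


def nodes_needing_attention(states, now, threshold=1):
    """Return sorted node names with at least `threshold` missed intervals."""
    flagged = {s.node for s in states if missed_intervals(s, now) >= threshold}
    return sorted(flagged)


# Invented sample data for illustration.
now = datetime(2024, 1, 1, 12, 0)
five_min = timedelta(minutes=5)
states = [
    PolicyState("nodeA", "cpu-monitor", True, five_min, now - timedelta(minutes=12)),
    PolicyState("nodeB", "cpu-monitor", True, five_min, now - timedelta(minutes=3)),
    PolicyState("nodeC", "cpu-monitor", False, five_min, now - timedelta(hours=2)),
]
print(nodes_needing_attention(states, now))
```

Disabled policies are deliberately excluded in this sketch so that an intentionally stopped policy is not reported as a failure.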

 

What’s more?

 

The Health View is designed for both Operations and Performance personas. It can coexist with existing agent health monitoring solutions (SelfMon, HBP, HP OM Agent Health Check, HP OM Health Monitoring component, etc.) and uses the existing communication channel (BBC), so no additional port needs to be opened.

One more thing: you can install the Dashboard on either an OM server or a non-OM server (which is more suitable for the OMi environment).

 

For more details on installing and configuring the Health Dashboard, see the Operations Agent 12.00 Health View Guide.

Here are the HP Operations Agent 12.00 release notes, and a zip of all the manuals is available here.

 

Have a look at the post "Exploring the potential of HP Operations Agent version 12" to learn more about what's new in HP Operations Agent version 12.00.

This is the first of a series of blog posts on Health View. Keep your eyes on the Business Service Management blog for these future posts.

 

 

You can also learn more about the Operations Agent and the new Health Dashboard here.

 

Comments
Hi Girish,

I understand that agent 12.0 must be installed on the Health View server. But for target nodes, is only 12.0 supported, or are lower versions (11.14) supported as well?

 

Regards,

Ashish