
What’s new in the Operations Manager i Management Pack for Infrastructure 2.0

GaneshKumar_A

Hewlett Packard Enterprise has just released the OMi Management Pack for Infrastructure 2.0 (Operations Manager i Management Pack for Infrastructure), which includes a number of enhancements to improve your incident management. In this blog article, we’ll walk through the new monitoring features step by step. But first, let’s review the management pack basics.

The OMi MP for Infrastructure works with Operations Manager i (OMi), a core component of the Operations Bridge Suite, and enables you to monitor the various systems operating in a data center environment. The MP for Infrastructure includes rules that analyze and categorize the events occurring in the systems and report the health status of the systems. It also includes Management Templates for monitoring the availability, health, and performance of individual systems, clusters, and virtual nodes. Each Management Template consists of a range of Aspects that enable monitoring of the individual system components.

Management Templates can be deployed by administrators for monitoring the systems in any environment. They are easily customizable by subject matter experts (SMEs) and developers to suit different monitoring requirements.

Now, let’s get to the exciting part and talk about what’s new with the OMi Management Pack for Infrastructure – 2.0.

1. Process Monitoring: Infra MP 2.0 lets you monitor all the required processes and process groups on UNIX systems, and you can monitor multiple processes using a single policy. Two new policies, Sys_ProcessMonitor and Sys_ProcessMonitorConfig, have been added to the Key System Services Availability Aspect. This Aspect monitors the key processes that run in the background to support the different tasks required by the operating system or application. In the Sys_ProcessMonitorConfig policy, you specify all the processes and process groups, along with the location of the procmon.cfg configuration file, before deploying the Aspect. You can also supply the process group and the location of procmon.cfg as optional parameters while deploying the Aspect. The Sys_ProcessMonitor policy then monitors all the processes in the process groups.

 

Alerts are generated whenever the processes defined in the configuration file do not run as expected or run outside the configured limits during the specified time of day and day of the week. After the Aspect is deployed, if a procmon.cfg file already exists in the location specified in the Sys_ProcessMonitorConfig policy, the file is overwritten. If the file does not exist, a new file is created in the designated location.

prospect monitoring aspect alert.png

 

In the above screenshot, you see an alert generated for the process monitoring Aspect. The process monitor policy is configured to expect 0 instances of the opcmsga and opcmona processes and 1 instance of the opcmsgi process. On the agent, you can see that the opcmsga and opcmona processes are running and the opcmsgi process is not running. Both conditions violate the policy rules: for opcmsga and opcmona, 1 instance is running where 0 are expected, and for opcmsgi, 0 instances are running where 1 is expected. Hence an event was generated.
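The comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the logic or file format used by the Sys_ProcessMonitor policy; the function name `check_process_counts` and the dictionary layout are assumptions made for this example.

```python
from collections import Counter


def check_process_counts(expected, running):
    """Return (name, expected_count, actual_count) for every mismatch.

    expected: dict mapping process name -> expected number of instances
    running:  list of process names currently seen on the node
    """
    actual = Counter(running)  # missing names count as 0
    return [
        (name, want, actual[name])
        for name, want in expected.items()
        if actual[name] != want
    ]


# Hypothetical data mirroring the screenshot scenario.
expected = {"opcmsga": 0, "opcmona": 0, "opcmsgi": 1}
running = ["opcmsga", "opcmona"]

violations = check_process_counts(expected, running)
for name, want, got in violations:
    print(f"{name}: expected {want} instance(s), found {got}")
```

All three processes are reported here, because each one deviates from its expected instance count.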

2. Adaptive Thresholding: In addition to the existing Adaptive Threshold policies, the MP for Infrastructure 2.0 uses the adaptive threshold concept to determine optimal threshold values from the historical records of performance characteristics and usage patterns of infrastructure resources. This replaces the fixed threshold values that are specified in the policies. The policies that use adaptive thresholds calculate the average value of the metric during the last hour. The average value is then compared to the data collected in the previous four weeks during the same hour or day interval. If there is a significant difference in the value, the policy generates an alert.

Constant threshold values set in the policies are ideal for a specific scenario, but not for all scenarios. The threshold values need to change according to the type of environment to keep monitoring of the infrastructure resources effective. Distributed system environments generally follow predictable trends over time, so adaptive thresholding automatically calculates the threshold values from the performance data available for the previous weeks.

 Agent base line metrics.png

The above screenshot shows the baseline metrics collected by the Agent.

Adaptive threshold policy.png

 

In the above screenshot, you see the alert generated by such an adaptive threshold policy, based on the baseline metrics collected. The policy calculates the average value (4.73) and then compares it to the data collected in the previous four weeks during the same hour or day interval.
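As a rough sketch of the idea, the following Python snippet compares the last hour's average with the values recorded for the same hour in the previous four weeks. The article does not say how a "significant difference" is measured, so the rule used here (more than two standard deviations from the historical mean) is purely an assumption, as are the function name `is_anomalous` and all sample values.

```python
from statistics import mean, pstdev


def is_anomalous(last_hour_samples, same_hour_history, tolerance=2.0):
    """Compare the last hour's average with the historical baseline.

    Assumption: "significant" means more than `tolerance` standard
    deviations away from the mean of the historical values.
    """
    current = mean(last_hour_samples)
    baseline = mean(same_hour_history)
    spread = pstdev(same_hour_history)
    alert = abs(current - baseline) > tolerance * spread
    return alert, current, baseline


# Hypothetical metric samples from the last hour (average is 4.73).
last_hour = [4.5, 4.9, 4.8, 4.72]
# Hypothetical averages for the same hour in each of the previous four weeks.
history = [2.1, 1.9, 2.0, 2.2]

alert, current, baseline = is_anomalous(last_hour, history)
print(f"current={current:.2f} baseline={baseline:.2f} alert={alert}")
```

With these made-up numbers the current average sits far above the historical baseline, so the sketch flags it.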

3. Configuration Change Monitoring: The CCI Monitor policy monitors system configuration-related files (which are not supposed to be modified without notifying the admin), Windows registry settings, and command output for changes. All the configuration items that you want to monitor can be added to the config file policy or to the ccilist.cfg configuration file, located at:

On Windows:

%OvDataDir%\conf\ccimon\configuration

On UNIX:

/var/opt/OV/conf/ccimon/configuration

 

The Change Configuration Monitor policy monitors the following changes on the system:

  • Software installed, removed or modified
  • Patches/service packs/updates installed
  • Changes to Kernel parameters
  • Boot configuration
  • Registry key for Windows
  • Kernel image file
  • All user accounts
  • System service configuration
  • Shared directories, NFS or CIFS (samba) mounts added, modified or removed
  • System environment variables

 

CCI and Desired State Monitoring:

In CCI Monitoring, a backup of all the files specified in the ccilist.cfg file is created during the first polling interval. From the next polling interval onward, the current version of each file is compared with its backup version, and alerts are generated if modifications are identified. The backup files are then overwritten with a fresh backup, so the comparison is always between the current version and the most recent backup version of the files.
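The rolling-backup behavior can be illustrated with a small Python sketch. It is not the CCI Monitor implementation: the `poll` function, the `.bak` naming, and the sample file are all hypothetical, and a byte-for-byte comparison stands in for whatever comparison the policy actually performs.

```python
import shutil
import tempfile
from pathlib import Path


def poll(monitored: Path, backup: Path) -> bool:
    """Run one polling cycle and return True if a change was detected."""
    if not backup.exists():
        # First polling interval: only create the backup.
        shutil.copy2(monitored, backup)
        return False
    changed = monitored.read_bytes() != backup.read_bytes()
    # Refresh the backup so the next poll compares against this version.
    shutil.copy2(monitored, backup)
    return changed


with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "sample.conf"      # hypothetical monitored file
    bak = Path(tmp) / "sample.conf.bak"  # hypothetical backup location
    cfg.write_text("max_users=10\n")
    results = [poll(cfg, bak)]           # first poll: backup created
    cfg.write_text("max_users=20\n")
    results.append(poll(cfg, bak))       # change detected
    results.append(poll(cfg, bak))       # backup was refreshed, no change

print(results)
```

Because the backup is refreshed after every comparison, a given modification is reported once and then becomes the new reference.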

 

In Desired State Monitoring, a gold file (with the extension .gold) is created for every file that is monitored and is stored in the same directory as that file. A gold file is a backup or reference file that remains unchanged.

 

For example, suppose that you want to monitor the mtab file located in the /etc directory. Take a backup of this file and save it as mtab.gold in the /etc directory. This is your reference file. To monitor the mtab file, add the following to the configuration file:

/etc/mtab==/etc/mtab.gold,file,Os,,major.

 

After the Aspect is deployed, a check is performed to verify whether desired state monitoring is defined in the ccilist.cfg configuration file. The files, Windows registry settings, and command outputs specified in the configuration file are then compared with the corresponding gold files. Alerts are generated whenever a difference is identified between the two.
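In contrast to the rolling backup used by CCI Monitoring, the gold file is never refreshed. A minimal Python sketch of that comparison follows. It assumes text files and uses a unified diff to report drift; the function name `diff_against_gold` and the sample contents are made up for illustration, and the sketch does not parse ccilist.cfg.

```python
import difflib
import tempfile
from pathlib import Path


def diff_against_gold(monitored: Path, gold: Path) -> list:
    """Return unified-diff lines between the gold file and the live file."""
    reference = gold.read_text().splitlines()
    current = monitored.read_text().splitlines()
    return list(
        difflib.unified_diff(
            reference, current,
            fromfile=gold.name, tofile=monitored.name, lineterm="",
        )
    )


with tempfile.TemporaryDirectory() as tmp:
    gold = Path(tmp) / "mtab.gold"  # reference copy, never rewritten
    live = Path(tmp) / "mtab"       # stand-in for the monitored file
    gold.write_text("/dev/sda1 / ext4 rw 0 0\n")
    live.write_text("/dev/sda1 / ext4 rw 0 0\n")
    no_drift = diff_against_gold(live, gold)   # identical files
    live.write_text("/dev/sda1 / ext4 ro 0 0\n")
    drift = diff_against_gold(live, gold)      # live file has drifted
    gold_after = gold.read_text()

print("\n".join(drift))
```

Every poll keeps reporting the difference until the monitored file matches the gold file again, which is the point of a fixed reference.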

 

4. Real Time Alerts: The Real Time Alerts policy monitors congestion and bottleneck conditions for system resources such as the CPU, memory, network, and disk. The policy performs real-time collection of metric values. CPU bottleneck monitoring is based on global CPU utilization and load average (Run Queue Length). Memory bottleneck monitoring is based on memory utilization, free memory available, and memory swap-out rate. Filesystem monitoring is based on the space utilization level of the busiest filesystem on the node. Network monitoring is based on packet collision rate, packet error rate, and outbound queue length. The Real Time Configuration policy defines the thresholds for these parameters. When a threshold is breached, alert messages notify the system administrator quickly, which helps reduce downtime in the production environment.

Note: The RTMA license must be enabled on the HPE Operations Agent node for the RealTimeAlerts policy to fetch real-time data.

CPU Bottleneck probability.png

As a post-deploy action, this policy runs a Perl script (advisor.pl) and writes the advisor output to the adv.out file. The policies run on the node, read data from the adv.out file, and send alerts to the OMi event management console.
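To make the CPU bottleneck check more concrete, here is a small Python sketch that combines the two inputs named above, global CPU utilization and run queue length. The threshold values, the `CpuSample` type, and the rule that both limits must be exceeded are assumptions for illustration only; the real thresholds come from the Real Time Configuration policy.

```python
from dataclasses import dataclass


@dataclass
class CpuSample:
    utilization_pct: float   # global CPU utilization, in percent
    run_queue_length: float  # load average (run queue length)


def cpu_bottleneck(sample, util_limit=90.0, queue_limit=3.0):
    """Flag a bottleneck when both hypothetical limits are exceeded."""
    return (
        sample.utilization_pct > util_limit
        and sample.run_queue_length > queue_limit
    )


# Hypothetical readings.
busy = CpuSample(utilization_pct=96.0, run_queue_length=5.2)
spiky = CpuSample(utilization_pct=96.0, run_queue_length=0.8)

print(cpu_bottleneck(busy), cpu_bottleneck(spiky))
```

Requiring both conditions in this sketch avoids flagging a short utilization spike when no work is actually queuing for the CPU.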

5. Security Policies for Log File monitoring: The MP for Infrastructure 2.0 includes enhanced monitoring of bad logins and system logs for AIX, HP-UX, Solaris, and Linux. It also enables monitoring of SNMP logs for Solaris.

Secuirty policies for log file monitoring.png

In the above screenshot, an event was generated because of a failed login by a user.

 

6. Energy Data Collection: Start/Stop Collection is a tool used to collect metrics from systems where HPE Operations Agent version 12.x is running. This tool measures the energy utilization of physical machines where multiple virtual machines are installed. You can use these metrics to draw a graph in Performance Dashboard. The Start/Stop Collection tool is supported only on Windows and Linux platforms and functions only when HP Integrated Lights-Out (iLO) is installed on the physical server.

start stop collection tool.png

 

 Execution Result.png

7. HPE Operations Agent Self-Monitoring: The MP for Infrastructure 2.0 contains Agent policies that monitor Operations Agent processes. These policies were previously shipped with Operations Manager for Windows/Linux.

8. HPE Operations Agent Tools: The MP for Infrastructure 2.0 contains Agent tools that perform start, stop, and status operations on the Agent. These tools were previously shipped with Operations Manager for Windows/Linux.

Operations Agent tool.png

 

9. Operations Agent Performance data collection: The MP for Infrastructure 2.0 also includes Performance Collection Components in the form of Aspects containing policies that help you collect performance metrics from the node. You can store the collected information in the data store. The performance data collector collects system performance metrics at regular intervals, and you can configure both the type of data to be collected and the collection interval. The performance alarm feature of the Performance Collection Component enables you to generate events based on predefined conditions.

A Note on x86 Virtualization Technology Evolution:

OMi MP for Infrastructure 2.00 does not support monitoring of x86 virtualization. HPE recommends that you use the monitoring-only edition of HPE Cloud Optimizer for monitoring x86 virtualization technologies such as VMware vSphere, KVM, and Xen. Management Templates, Aspects, and policies related to these technologies are not available from OMi MP for Infrastructure version 2.00 onward.

For x86 virtualization technologies, you can continue to use the virtualization component of the OMi MP for Infrastructure (versions earlier than 2.00) until the obsolescence of HP Operations Agent 11.1x. For non-x86 virtualization technologies (HPVM, AIX, Solaris), you can continue to use the OMi MP for Infrastructure.

Some of the key benefits that you get with the technology evolution to HPE Cloud Optimizer are:

  • Improved scalability and interoperability.
  • Improved coverage of metrics.

 

Explore the full capabilities of the Operations Bridge Suite and technology integrations:

MP for Infrastructure 2.0

 

 


Comments
HPE Blogger

Hello,

Thanks for your blog post. Very interesting. I have tested the MP Infra 2.0 and have some questions:

  • Filesystem discovery discovers /proc, /cdrom, and /sys (which are not real filesystems). It also generates messages for them (e.g. /proc is 100% full). Will this be improved?
  • Is InfraMP able to set the "System Availability" and "System Performance" KPI?

Kind regards,
Harald
