I recently worked with a customer, whom we will call Eva, who had complex availability reporting needs across all CI types, including business services. If this sounds familiar and you struggle with reporting as Eva did, keep reading to find out how we met Eva’s needs and made her successful.
Eva’s availability reporting needs included:
Availability of all CI types: business services, applications, transactions, databases, servers and right down to processes
Availability computation for each based on different rules
Different propagation rules to calculate the availability of aggregated CI types like business services and CI collections
Availability computed at different time intervals, viz. daily, weekly, monthly, ...
CI grouping and report filtering based on business services and CI collections
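To make the idea of per-CI propagation rules concrete, here is a minimal sketch of how the availability of an aggregated CI might be derived from its children. The rule names (`worst_child`, `average`), the CI names, and the numbers are all hypothetical illustrations, not the rule names or the implementation used by BSM SLM.

```python
from statistics import mean

# Hypothetical propagation rules: each takes the child availabilities (in %)
# and returns the availability of the aggregated CI.
def worst_child(values):
    """The aggregate is only as available as its least available child."""
    return min(values)

def average(values):
    """The aggregate is the plain average of its children."""
    return mean(values)

PROPAGATION_RULES = {"worst_child": worst_child, "average": average}

def aggregate_availability(child_availability, rule):
    """Apply the named propagation rule to a {ci_name: availability} mapping."""
    return PROPAGATION_RULES[rule](list(child_availability.values()))

# Illustrative data only: three child CIs under one business service.
children = {"web-server": 99.9, "database": 98.5, "login-transaction": 99.0}
service_worst = aggregate_availability(children, "worst_child")
service_avg = aggregate_availability(children, "average")
```

The point of the sketch is that the same child data yields different service-level availability depending on the rule, which is why Eva needed the rule to be configurable per aggregated CI type.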
The first approach we looked at was to use Service Health Reporter (SHR) on top of the existing OOB content packs (CPs): Systems Management for server availability, the BPM/RUM CPs for application availability, and other application-specific content packs such as Oracle DB, SQL Server DB, etc. The plan was then to build custom content on top, with potential customizations within each OOB CP. However, significant customizations to OOB content meant dealing with maintainability issues whenever SHR released newer versions of the content packs.
Then we analyzed an alternative approach: the BSM Service Level Management (SLM) route. It was instantly clear that SLM was the right place to define the complex availability KPIs. It offers the flexibility to define your own KPIs, each with potentially its own propagation rule, and a lot more. The only downside at that time was that there was no OOB integration between SHR and SLM. In other words, we needed to develop a new content pack. I knew this was the right approach: it offered flexibility and adaptability to Eva’s changing availability reporting needs, with easier future maintainability. It also opened up the possibility of making the solution more holistic, for example by also reporting on performance KPIs instead of availability alone. And so I authored the SLM content pack in SHR; details are further below. You can also download it here.
First, a few words on SLM: Service Level Management ensures that agreed services are delivered when and where they are supposed to be delivered. With SLM, a service level agreement (SLA) is first defined, and the agreement is then measured over time. SLM calculates key performance indicator (KPI) and health indicator (HI) values from the received availability and performance data. It then compares them with the pre-defined service level objectives and records the resulting status at the pre-defined time intervals.
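The compare-and-record step described above can be pictured with a small sketch. It assumes a simplified objective with two thresholds and three status labels; the `Objective` class, the `kpi_status` function, the labels, and the sample values are all hypothetical and are not SLM's actual objective model or status names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Objective:
    """Hypothetical service level objective for an availability KPI (in %)."""
    met: float       # value at or above this threshold counts as "met"
    breached: float  # value below this threshold counts as "breached"

def kpi_status(value, objective):
    """Compare a calculated KPI value with its objective."""
    if value >= objective.met:
        return "met"
    if value >= objective.breached:
        return "at risk"
    return "breached"

# Illustrative data only: one daily availability value per tracking period.
objective = Objective(met=99.5, breached=98.0)
daily_availability = {"2014-12-08": 99.9, "2014-12-09": 98.7, "2014-12-10": 97.2}

# One status is recorded per tracking period.
statuses = {day: kpi_status(value, objective)
            for day, value in daily_availability.items()}
```

With these sample numbers, the three days come out as "met", "at risk", and "breached" respectively.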
The SHR content pack for BSM SLM (SLM CP) provides collection and reporting on:
The SLAs and their associated KPIs
All the SLA components defined within BSM SLM, mainly the calendars, tracking periods, SLM objectives and offerings
The RTSM View below, which is used by SHR to collect the SLA and its components, gives a good pictorial representation of all SLA-related CIs that are collected by the SLM CP.
A note on the RTSM view shown above: a lot of the SLM content pack design complexity revolved around getting this view definition right, mainly the model, the cardinalities, and the relationships (including the multiple relationships that constitute one virtual compound relationship definition). For example, see the definition of the relation “Virtual Compound 1” in the above view in the RTSM Modeling Studio.
A good practice for any new content pack development in SHR is to design the RTSM view for the CIs and the topology you plan to bring from RTSM in the early stages (even a whiteboard design should do). This gives you a much better understanding of how to model the content in SHR. For best practices on SHR content development in general, see the blog Content development and extensibility in HP Service Health Reporter from my colleague.
In addition to the SLA components collected from the view above, the SLM CP also needs to collect the CIs defined under each SLA. Because of the way the defined SLAs are exposed in RTSM, whereby a new TQL is dynamically created for each new SLA and its CIs, the SLM CP first collects the list of SLAs and their CI_IDs, and then generates a collection policy based on the dynamic TQL names. The collection from these dynamic TQLs provides the SLAs and their CIs. See the SLM CP guide for the design details.
Last but not least, the SLM CP collects the SLA KPIs and their status at all the defined tracking periods from the BSM profile DB (although the KPI names are queried from the BSM management DB). Note that the SLM CP only collects KPI status for “closed” tracking periods. Closed periods are the ones that have already completed and been finalized, e.g. the daily KPI status for Dec 10, 2014 will be available only on Dec 11, 2014. BSM SLM also keeps a running status for “open” tracking periods in a different set of tables, and the SLM CP doesn’t collect those. If you need to report on open tracking periods, try the SLM reports available within BSM SLM.
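The closed-versus-open distinction can be illustrated with a short sketch for daily tracking periods. This is only a conceptual model: the function names are hypothetical, and the SLM CP itself does not compute this. It reads closed-period status from the tables where BSM SLM stores it. The sketch also ignores any processing delay between a period ending and its status being finalized.

```python
from datetime import date, timedelta

def daily_period_end(day):
    """Exclusive end of a daily tracking period: the start of the next day."""
    return day + timedelta(days=1)

def is_closed(period_end, today):
    """A period is treated as closed once its end has been reached."""
    return period_end <= today

def closed_daily_periods(days, today):
    """Keep only the daily periods that are already closed as of `today`."""
    return [d for d in days if is_closed(daily_period_end(d), today)]

# As of Dec 11, 2014, the periods for Dec 9 and Dec 10 are closed,
# while the period for Dec 11 itself is still open.
days = [date(2014, 12, 9), date(2014, 12, 10), date(2014, 12, 11)]
closed = closed_daily_periods(days, today=date(2014, 12, 11))
```

The same idea extends to weekly and monthly tracking periods by swapping in a different period-end calculation.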
Out of the box, the SLM CP currently includes two reports, covering the SLAs and their KPI status across different tracking periods (see the report screenshots below). I plan to develop more reports and discuss them here in the near future. However, all the data and the model needed to develop more advanced reports are already part of this version of the content pack, so you should be able to design your own advanced reports on top of it.
You can download the SHR SLM content pack, its user guide (which includes the design details), and its development source files. Comments, suggestions, and defect reports are more than welcome below. You can find other useful blogs and resources on the HP Service Health Reporter Live Network page.