Guest post by Marina Lyan, System Architect, HP Operations Analytics.
You can now leverage HP's big data technologies from the HAVEn big data platform to preserve all your IT data. Your Vertica database is storing thousands of metrics, logs and events for hundreds of thousands of systems. Think of all of the potential within your data.
This potential sounds great, but now you have a new challenge. Having all that data in one place is useless if you are not able to locate the data that you are looking for. You need some help finding the golden nugget within the data mine.
What is PQL?
HP Operations Analytics comes with the tools to help you easily find the information you need. It takes an intuitive approach to searching and navigating Big Data for operations analysis. This approach is implemented through PQL (Phrased Query Language) and its query engine: users simply enter what they are looking for, or what they are interested in.
“What’s going on with my disk IO?” will be [disk io] in PQL
“Are our systems in Florida doing well?” will be [Host: "*florida.usa.com" Focus on: system status]
“What about the database cluster serving our sales reporting service in America?” will be [Service: "America Sales" Drill to: "Database cluster"]
“What is the performance of my Server?” will be [Host: "server123.mydomain.company.com" Focus on: performance utilization]
Legend: green – tags, blue – PQL syntax, Purple – data/user defined topology entities
So, for example, if you are looking for performance utilization information on a specific host, you just type ‘Host: "myd-vm02341.hpswlabs.adapps.hp.com" Focus on: performance utilization’ and get all the information.
PQL is very straightforward for the user. There is no need to know which data sources hold the information you need, and no need to define which tables to look at. Operations Analytics (OPSA) automatically locates the data that’s relevant to the user’s search.
Behind the scenes, we find all the performance-related metrics (CPU, memory, …) for the specified host across all the data in the system. This simple search is translated into multiple queries that go to different data sources and look for different metrics (in the example below, data source names are bolded and metric names are in green):
PQL makes it possible for the user to find all the data without writing a complex search query, which would require knowledge not only of the query language itself but also of what data exists in the system and where it is stored.
How tags simplify your search
To do this, OPSA needs to know which metrics in which data sources are associated with the terms users may search for. For example, in the case above, how do we know that the metrics “fs_space_util_peak” or “mem_swap_util” are relevant to a “performance utilization” search?
Operations Analytics achieves this through tags that add meaning to the metrics. These tags enable subject matter experts to add their knowledge to the system and improve its behavior.
Using tags, we can provide a common terminology across metric names and collections. In our example, tagging “cpu_util” in the “oa_sysperf_global” data collection and “i.utilization” from the “sitescope_cpu_metrics” collection with the same “performance” tag means that a search for “performance” will locate both metrics regardless of their name or source.
Furthermore, we use tags to identify entities such as Host, Database, and Application. The tags allow us to look at similar entities across different data sources without having to know how the data is physically stored in the system (e.g. column name, field ID, etc.).
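To make the idea concrete, here is a minimal sketch of a tag index in Python. It is not how Operations Analytics stores tags internally; the dictionary layout and the function names are assumptions made for illustration. Only the two collection and metric names tagged “performance” come from the example above, and the third entry is hypothetical.

```python
from collections import defaultdict

# Hypothetical tag assignments: (collection, metric) -> set of tags.
# The first two entries use names from the article; the third is invented.
METRIC_TAGS = {
    ("oa_sysperf_global", "cpu_util"): {"performance"},
    ("sitescope_cpu_metrics", "i.utilization"): {"performance"},
    ("example_disk_collection", "example_disk_metric"): {"disk"},  # hypothetical
}


def build_tag_index(metric_tags):
    """Invert the assignments so each tag maps to the metrics that carry it."""
    index = defaultdict(set)
    for (collection, metric), tags in metric_tags.items():
        for tag in tags:
            index[tag].add((collection, metric))
    return index


def find_metrics(index, tag):
    """Return the metrics carrying a tag, grouped by collection."""
    grouped = defaultdict(list)
    for collection, metric in sorted(index.get(tag, ())):
        grouped[collection].append(metric)
    return dict(grouped)


if __name__ == "__main__":
    tag_index = build_tag_index(METRIC_TAGS)
    # One search term resolves to metrics in two different collections.
    print(find_metrics(tag_index, "performance"))
```

The point of the sketch is that the person searching only supplies the tag; the differing metric names and collections are resolved by the lookup.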
For example, different data sources may provide information on a host, and the host name may be stored under different fields or names.
In NNMi the host name is stored in the property “node_name”, in OMi it's “HOSTINFO_DNSNAME”, and in SiteScope “target_name”.
The only way for Operations Analytics to locate all the information on a specific host is to know where the host name is stored in each data source. When configuring a collection, we use the “host_name” tag to abstract away the specifics of each data source and provide a single property that can be used across all the collection inputs.
For example, you can see the metadata for the NNMi collection:
This way, the search suggestions for a host include hosts from all the data sources (e.g. SiS, NNMi and OMi).
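The “host_name” tag can be pictured as a per-source mapping from one logical property to the physical field that holds it. The Python sketch below illustrates that idea under stated assumptions: the three field names are the ones mentioned above, while the record layout, the sample host names, and the helper functions are hypothetical and do not reflect the product's actual metadata format.

```python
# Which physical field carries the "host_name" tag in each source.
# Field names are taken from the article; everything else is illustrative.
HOST_NAME_FIELD = {
    "NNMi": "node_name",
    "OMi": "HOSTINFO_DNSNAME",
    "SiteScope": "target_name",
}

# Hypothetical sample records, one list per data source.
SAMPLE_RECORDS = {
    "NNMi": [{"node_name": "web01.example.com"}],
    "OMi": [{"HOSTINFO_DNSNAME": "web02.example.com"}],
    "SiteScope": [
        {"target_name": "db01.example.com"},
        {"target_name": "web01.example.com"},
    ],
}


def host_of(source, record):
    """Read the host name from a record using the source's tagged field."""
    return record.get(HOST_NAME_FIELD[source])


def suggest_hosts(records_by_source, prefix):
    """Collect distinct host names from every source that match a prefix."""
    wanted = prefix.lower()
    hosts = set()
    for source, records in records_by_source.items():
        for record in records:
            host = host_of(source, record)
            if host and host.lower().startswith(wanted):
                hosts.add(host)
    return sorted(hosts)


if __name__ == "__main__":
    print(suggest_hosts(SAMPLE_RECORDS, "web"))
```

Because every source is read through the same logical property, a host that appears in several sources shows up once in the suggestions.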
Tagging is the means by which we can add common meaning to the disparate data sources in OPSA. Tags create common names for different metrics across data sources so they can be searched for together. Tags can tell the system where the entity names are stored across collections so a search can locate entity data regardless of the actual database structure.
Operations Analytics comes with tags configured for its out-of-the-box collections, but users can easily add their own tags for additional collections they’ve created and to support the searches they require.
HP Operations Analytics allows you to store all of your data, and provides a simple and intuitive search language (PQL) so you can find the relevant data that you’re interested in.
Tags are used to simplify the search, and provide an easy way to search across different data sources.
As we hinted, you can also add your own custom collections and put tags in place so you can easily search for the information you need in the data you want. Stay tuned for another blog post that will tell you how to do this.
Architect and User Experience expert with more than 10 years of experience in designing complex applications for all platforms. Currently in Operations Analytics - Big data and Analytics for IT organisations. Follow me on twitter @nuritps