The IT management industry frequently uses the term "analytics". However, the meaning of the term is often ambiguous. Consequently, product users do not understand the specific value they are getting from their analytics solution. This blog post shares my own view of analytics.
Analytics in software is the task of applying logic to collected data to provide insights that might otherwise not be obvious.
The insights are meant to:
Help resolve problems
Perform correlations to reduce noise
Predict and prevent problems
Plan scenarios for infrastructure deployment, etc.
Without analytics, humans would have to perform many of these logical tasks manually, which is time-consuming and error-prone. So, what are the different kinds of analytics, and why are they relevant? I classify analytics broadly as follows:
Pre-load analytics: Analytics performed before the data is persisted. This kind of analytics is useful when insights are needed quickly. For example, when an event storm is being generated, it is important to understand the nature of the storm and eliminate the noise quickly. Similarly, when a security breach is under way, it is important to detect its origin from log messages and similar data while the breach is still happening. This is an area where machine learning can be used to learn more about various patterns.
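As a rough illustration of pre-load noise reduction, the sketch below suppresses repeated events in a stream before anything is stored. It is a minimal example, not a description of any particular product: the function name `suppress_storm`, the `(timestamp, source, message)` tuple layout, and the 60-second window are all assumptions made for illustration.

```python
def suppress_storm(events, window_seconds=60):
    """Yield an event only if the same (source, message) pair was not
    already passed through within the last `window_seconds`.

    `events` is assumed to be an iterable of (timestamp, source, message)
    tuples in timestamp order; this layout is hypothetical.
    """
    last_passed = {}  # (source, message) -> timestamp of the last event let through
    for ts, source, message in events:
        key = (source, message)
        previous = last_passed.get(key)
        if previous is None or ts - previous >= window_seconds:
            last_passed[key] = ts
            yield (ts, source, message)


# Hypothetical event stream: timestamps are seconds since some start point.
events = [
    (0, "db1", "disk full"),
    (10, "db1", "disk full"),   # repeat inside the window, suppressed
    (20, "web1", "timeout"),
    (70, "db1", "disk full"),   # window has elapsed, passed through again
]
kept = list(suppress_storm(events))
```

Because the function is a generator, it processes events one at a time as they arrive, which fits the idea of acting before the data is persisted. A real deployment would also need to expire old keys so the dictionary does not grow without bound.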
Post-load continuous analytics: This type of analytics runs continuously in the background on stored data, using one or more algorithms or functions. Machine learning can also be applied here to learn patterns in the data, for example seasonal behavior at minute, hourly, or daily intervals. What is learned can later be used for problem prediction or other applications.
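One simple way to picture learning seasonal behavior from stored data is a per-hour baseline: compute the mean and spread of a metric for each hour of the day, then flag new values that fall far outside it. The sketch below assumes samples arrive as `(hour_of_day, value)` pairs and uses a three-standard-deviation threshold; both choices, and the function names, are illustrative assumptions rather than a recommended algorithm.

```python
from collections import defaultdict
from statistics import mean, pstdev


def learn_hourly_baseline(samples):
    """Return {hour: (mean, standard deviation)} from (hour_of_day, value) pairs."""
    by_hour = defaultdict(list)
    for hour, value in samples:
        by_hour[hour].append(value)
    return {hour: (mean(values), pstdev(values)) for hour, values in by_hour.items()}


def is_anomalous(baseline, hour, value, threshold=3.0):
    """Return True if `value` deviates from the learned baseline for `hour`
    by more than `threshold` standard deviations."""
    if hour not in baseline:
        return False  # nothing learned for this hour yet, so nothing to compare against
    avg, spread = baseline[hour]
    if spread == 0:
        return value != avg
    return abs(value - avg) > threshold * spread


# Hypothetical stored samples: CPU utilization observed at 09:00 and 14:00.
history = [(9, 50), (9, 52), (9, 48), (14, 80), (14, 82), (14, 78)]
baseline = learn_hourly_baseline(history)
```

With this baseline, a reading of 80 at 09:00 is flagged as unusual, while the same reading at 14:00 is not, which is the kind of seasonal distinction a fixed threshold cannot make.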
Post-load ad hoc analytics: This is analytics performed on persisted data, on demand. One can think of it as query-based analytics, where a query is used to analyze data on demand. A query is also used to filter the data to which analytics is applied. Users typically follow one of two paths: they either search for the data and then apply the appropriate analytics to it, or they already know which data they need to analyze and are looking for patterns in it. Analytical tools such as R are very useful here.
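To make the query-based idea concrete, here is a small sketch in which a single query both filters the persisted data and applies an aggregate to it. It uses Python's built-in `sqlite3` module with an in-memory database purely for illustration (it is not R, and not any specific analytics product); the `metrics` table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

# In-memory database standing in for the persisted data store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (host TEXT, metric TEXT, value REAL)")
conn.executemany(
    "INSERT INTO metrics VALUES (?, ?, ?)",
    [
        ("web1", "cpu", 40.0),
        ("web1", "cpu", 60.0),
        ("web2", "cpu", 90.0),
        ("web2", "mem", 70.0),
    ],
)


def average_by_host(connection, metric):
    """Filter rows to one metric, then compute the average value per host."""
    rows = connection.execute(
        "SELECT host, AVG(value) FROM metrics "
        "WHERE metric = ? GROUP BY host ORDER BY host",
        (metric,),
    )
    return dict(rows.fetchall())


cpu_averages = average_by_host(conn, "cpu")
```

The `WHERE` clause plays the filtering role described above, and the `AVG` with `GROUP BY` is the on-demand analysis applied to the filtered data.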