According to a 2016 survey of 100+ IT professionals, capacity planning as practised in agile and on-demand IT environments is far from ideal.
Key points from the survey:
Capacity planning capabilities for core software (email, CRM, ERP, etc.) and customer-facing software (portal, mobile back end, etc.) are considered "very effective" by less than 30% of the respondents.
Infrastructure capacity planning (server, network and storage) is a better story overall, but in most cases more than 40% of respondents feel that the planning is "less than ideal". The exception is mainframes and converged systems, where the need is not felt (most likely because the upfront investment was high, so there is an abundance of resources beyond the actual needs).
50% or more of respondents agree that ad hoc approaches are frequently taken for judging future needs and planning for critical resources: usage is monitored with static thresholds, and nothing happens until a critical alarm goes off.
Predictive analytics or machine learning approaches are used for capacity planning in less than 10% of cases.
Human involvement is key (> 85% favourable responses), and tool guidance plays a smaller role in this space.
More than 40% of respondents cited lack of senior exec understanding and difficulty securing budget for a capacity planning solution as major issues, with more (up to 60%) considering these a significant challenge.
Hopefully I have established the point about the lack of capacity planning as an entrenched IT practice / discipline.
It should be noted that capacity planning is still not taken seriously, as money typically gets allocated for buying new hardware only when the business demands it. This compensates for the lack of planning by spending money, often at an increased cost of procurement and setup.
One would find IT spend highest in healthcare, financial, government and IT companies, and this is not without reason - these are ultimately the highest-growth sectors with high margins, so there's enough vitamin "M" to keep going.
This image was originally posted to Flickr by ToastyKen at http://flickr.com/photos/24226200@N00/3584188972. I added the caption on the image - it is a fun take on disaster recovery planning - I don't intend to insult or offend anyone.
Okay, so to come to the key point of this blog article - the amount of money and time invested today in fixing operational problems could be reduced and re-focused towards fixing capacity allocations and redistributing the existing resources in an SDDC world.
Operations monitoring in its true sense is reactive. Yes, there's predictive alerting, which is statistics based - but those are the same statistics that capacity planners work with - so that's really capacity planning.
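To make the contrast between a static threshold and a statistics-based forecast concrete, here is a minimal sketch. It assumes one utilization sample per day and a simple straight-line trend; the function name `days_until_threshold` and the 90% threshold are illustrative assumptions, not something taken from any particular tool.

```python
from typing import Optional, Sequence


def days_until_threshold(daily_usage_pct: Sequence[float],
                         threshold_pct: float = 90.0) -> Optional[float]:
    """Estimate how many days remain before usage crosses the threshold.

    Fits an ordinary least-squares line through the samples (one per day)
    and extrapolates it. Returns 0.0 if the latest sample is already at or
    above the threshold, and None if the trend is flat or falling.
    """
    n = len(daily_usage_pct)
    if n < 2:
        raise ValueError("need at least two daily samples")
    if daily_usage_pct[-1] >= threshold_pct:
        return 0.0

    # Least-squares slope and intercept, with day index 0..n-1 as x.
    mean_x = (n - 1) / 2
    mean_y = sum(daily_usage_pct) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in enumerate(daily_usage_pct))
    slope = sxy / sxx
    if slope <= 0:
        return None  # no growth, so no projected crossing

    intercept = mean_y - slope * mean_x
    crossing_day = (threshold_pct - intercept) / slope
    # Days counted from the most recent sample (day index n-1).
    return max(0.0, crossing_day - (n - 1))
```

A static threshold would stay silent on a series like 50, 52, 54, 56, 58 until the day usage hits 90%, whereas the trend already says roughly how much runway is left. A real planner would also account for seasonality, which a straight line ignores.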
In the physical server world, there's 1 server to an app or a workload - server admins and app admins preferred that the walls be tight on these - very rarely was 1 physical server used to run both Oracle and WebSphere Application Server. Server utilization rates were really low. This led to a need for various types of partitioning - electrically isolated partitions, VMs, containers (which Docker has made popular today, but the technology / concepts existed quite some time back - circa 1979).
So with resource sharing in heavy use, there's more dynamism in the usage numbers. Also, in the SDDC world, CPU cores and memory drive even IOPS, unlike in the physical world where this was done differently - no longer does an IOPS latency issue mean a storage / SAN level problem. The problem could be in the software at any level in the stack.
Would dynamic VM movement (live migration or VMware DRS) fix a capacity issue? The simple answer to this - NO. VM migration is for load balancing and it is also reactive; there is no way it would give back any capacity that was wrongly allocated.
Can capacity be pulled back from a VM at any time? No, again. Capacity assigned to a workload has to be taken back carefully, looking at the past trends and the future needs, and also keeping in mind other factors such as license compliance and safe harbour.
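As a rough illustration of taking capacity back carefully, the sketch below sizes a VM from the 95th percentile of its past vCPU usage plus some headroom, and never goes below a floor that could stand in for a licensing or compliance minimum. The function name, the 20% headroom and the floor parameter are assumptions made for this example only.

```python
import math
from typing import Sequence


def recommended_vcpus(used_vcpus_history: Sequence[float],
                      allocated: int,
                      headroom: float = 0.2,
                      floor: int = 1) -> int:
    """Suggest a reclaimed vCPU allocation from historical usage.

    Uses the nearest-rank 95th percentile of past usage, adds headroom,
    and clamps the result between `floor` and the current allocation
    (this sketch only reclaims capacity, it never recommends growth).
    """
    if not used_vcpus_history:
        raise ValueError("usage history is empty")
    ordered = sorted(used_vcpus_history)
    # Nearest-rank percentile: rank = ceil(0.95 * n), in integer arithmetic.
    rank = (95 * len(ordered) + 99) // 100
    p95 = ordered[rank - 1]
    target = math.ceil(p95 * (1 + headroom))
    return min(allocated, max(floor, target))
```

Using a high percentile rather than the average keeps short peaks in view, and the floor is where factors outside the usage data, such as license terms, would enter. Future needs are not modelled here and would have to be layered on top.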
Will spreadsheets be enough to handle all of your planning needs? NO, manual efforts tend to be slow and error-prone. It is important to ensure that the data is obtained in real time and the calculations are done fast, so that the planning correctly takes seasonal usage patterns into account. It is also important that simulations on the data can be run using an analytical planning tool.
In the on-prem cloud, finding the best place for a new workload is driven by capacity planning or assessment. It is surely not something the ops team would be involved in. However, the after-effect of a sub-optimal placement (like when that IO-intensive workload got placed on a low-IOPS datastore) is an ops problem, and this could be avoided with good planning. It is a fine line between the two.
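The placement decision can be sketched as a simple filter-then-pick step. The example below is only an illustration under stated assumptions: the `Datastore` type, its fields and the best-fit rule (pick the candidate with the least spare IOPS left over) are hypothetical, and real placement engines weigh many more factors.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Datastore:
    name: str
    free_gb: float
    spare_iops: float


def place_workload(required_gb: float,
                   required_iops: float,
                   datastores: Sequence[Datastore]) -> Optional[Datastore]:
    """Pick a datastore that satisfies both the space and IOPS needs.

    Among the datastores that fit, choose the one with the smallest IOPS
    surplus (best fit), so fast storage is kept for workloads that need it.
    Returns None when nothing fits.
    """
    candidates = [d for d in datastores
                  if d.free_gb >= required_gb and d.spare_iops >= required_iops]
    if not candidates:
        return None
    return min(candidates, key=lambda d: d.spare_iops - required_iops)
```

With this check in place, an IO-intensive workload is never offered the low-IOPS datastore just because it has the most free space, which is exactly the after-effect that would otherwise land on the ops team.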
Ok, but the bigger question then - will public cloud just take us out of this planning game, or do we still need the analytics and trending that we use in capacity planning? Since every (public) cloud has a silver lining which comes with a bill nowadays, enterprises will need to consider, and have already started considering, how long to stay there versus going for their own datacenters. Dropbox did their calculations, I am sure, in moving out to their own datacenter, mainly due to their economies of scale. The analytics and trending that capacity planning tools offer will help in determining this - in fact entire "what-if" scenarios could be built on this.
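A very small "what-if" of this kind is a break-even calculation between a growing cloud bill and an own-datacenter option with an upfront cost. The sketch below is a toy model under loud assumptions: all figures are inputs you supply, cloud spend grows by a fixed rate per month, and the function name and parameters are invented for illustration.

```python
from typing import Optional


def break_even_month(cloud_monthly: float,
                     onprem_upfront: float,
                     onprem_monthly: float,
                     cloud_growth: float = 0.0,
                     horizon_months: int = 120) -> Optional[int]:
    """Return the first month in which cumulative cloud spend reaches
    cumulative on-prem spend, or None if that never happens within the
    horizon. `cloud_growth` is the month-over-month growth of the bill.
    """
    cloud_total = 0.0
    onprem_total = onprem_upfront
    bill = cloud_monthly
    for month in range(1, horizon_months + 1):
        cloud_total += bill
        onprem_total += onprem_monthly
        if cloud_total >= onprem_total:
            return month
        bill *= 1 + cloud_growth  # next month's bill grows with usage
    return None
```

Feeding the growth rate from actual usage trends, rather than a guess, is where capacity planning data would make such a scenario meaningful.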
So here's the poser - Would you rather wait for an alert to get your capacity planning done, or do you wish to be proactive?
Cloud Optimizer offers performance monitoring and capacity optimization for virtual, cloud and physical server (x64) environments in a single easy-to-install-and-use tool. Here are the quick specs.
Cloud Optimizer links into capacity disciplines as well as the operations discipline working with our suite offerings in the respective spaces - Hybrid Cloud Management and Operations Bridge. Here's an introductory video on this.
Ramkumar Devanathan (twitter: @rdevanathan) is Product Manager for HPE Cloud Optimizer (formerly vPV). He was previously a member of the IOM-Customer Assist Team (CAT) providing technical assistance to HP Software pre-sales and support teams with Operations Management products including vPV, SHO, VISPI. He has experience of more than 14 years in this product line, working in various roles ranging from developer to product architect.