As global enterprises grow, they must also transform their IT operations. That’s because two of the major drivers for growth require enterprise data centers to migrate their on-premises delivery and deployment platforms to a hybrid public/private cloud model and evolve from a reactive operational model to a proactive and, eventually, autonomous approach.

To achieve full-stack hybrid cloud visibility, application-centricity, and web-scale operations, mission-critical deployments will require a combination of big data analytics, real-time monitoring, and AI and machine learning (ML) technologies.

But can these lofty ambitions be met in today’s clouds? Some experts would say no, pointing to dynamic resource allocation (DRA) as the missing piece of the puzzle. I would go a bit further, though, and argue that DRA is the beginning of the fourth generation of cloud management.

The Scaling Challenge

Clouds, by nature, must scale elastically to support fluctuations in application workloads. DRA depends on horizontal scaling, which, in turn, depends on application architecture. Features such as auto-scaling allow organizations to scale according to the overall application workload or node-level resource utilization. As infrastructures grow more complex, DRA is becoming the linchpin for supporting them.

“Enterprises are looking for application and cloud service providers to help them operate more efficiently through the use of machine learning and artificial intelligence to deliver more effective resource management,” said Frank Jablonski, vice president of global marketing, SIOS Technology. “Achieving this will require the environment or application to understand when it needs more resources and then automatically scale up those resources to meet the increased demand. Conversely, the technology will need to understand when specific resources are no longer needed and safely turn them off to minimize costs. Today, such dynamic resource allocation can be unreliable or must employ an inefficient manual process, forcing cloud customers to either spend more than necessary or fall short of meeting service levels during periods of peak demand.”
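
To make the scale-up and scale-down behavior described above concrete, here is a minimal sketch of a utilization-driven scaling rule. It is an illustration only, not any vendor's implementation: the `ScalingPolicy` class, the `desired_node_count` function, and every threshold are hypothetical names and values chosen for the example. The proportional formula is similar in spirit to the one used by common horizontal autoscalers.

```python
import math
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    # All values below are assumptions picked for illustration.
    target_utilization_pct: float = 60.0  # desired average utilization per node
    tolerance: float = 0.10               # ignore deviations within 10% of target
    min_nodes: int = 2                    # never scale below this floor
    max_nodes: int = 20                   # never scale above this cost ceiling


def desired_node_count(current_nodes: int,
                       avg_utilization_pct: float,
                       policy: ScalingPolicy) -> int:
    """Return the node count that would bring average utilization back to target."""
    if current_nodes < 1:
        raise ValueError("current_nodes must be at least 1")

    ratio = avg_utilization_pct / policy.target_utilization_pct

    # Inside the tolerance band, do nothing; this avoids constant small adjustments.
    if abs(ratio - 1.0) <= policy.tolerance:
        return current_nodes

    # Proportional rule: more load per node means proportionally more nodes.
    # Rounding first guards against floating-point noise adding a spurious node.
    desired = math.ceil(round(current_nodes * ratio, 6))

    # Clamp to the configured floor and ceiling.
    return max(policy.min_nodes, min(policy.max_nodes, desired))


if __name__ == "__main__":
    policy = ScalingPolicy()
    print(desired_node_count(4, 90.0, policy))   # overloaded: scale out
    print(desired_node_count(10, 20.0, policy))  # underused: scale in
```

The floor and ceiling reflect the two failure modes in the quote: falling short of service levels during peaks and spending more than necessary when demand drops.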

Cloud Evolution and DRA

To understand how we’ve gotten to where we are today and why DRA has been identified as the next driver for growth in cloud optimization, it’s important to understand how the cloud has evolved over the last few years. The first generation of cloud optimization services focused on bill analysis. The second generation focused on cross-analyzing cost with capacity. The third and current generation encompasses the previous two generations as well as hybrid infrastructure monitoring and capacity planning. This generation yields recommendations on whether workloads will perform best on-premises or in the cloud (according to what is more cost-effective).

DRA has been identified as the enabler of the fourth generation of public cloud optimization because it gives enterprises the ability to offer stringent SLAs, meet their performance and availability objectives, autoscale, and deliver real-time responsiveness.

But let’s look even deeper. The most effective form of DRA builds on ML technologies that detect unusual jumps in workloads. The same technology can also detect sudden changes that may not be due to a legitimate increase in application workload but, instead, may represent a cyberattack. If auto-scaling occurs during nonpeak hours due to a workload increase in network, CPU, memory, or I/O, the security team is alerted immediately.
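
As a rough illustration of the check described above, the sketch below flags a sudden jump in a utilization metric and raises a security alert only when the jump falls outside peak hours. It substitutes a simple z-score test for a trained ML model, and the function names, the 3.0 threshold, and the 8:00 to 20:00 peak window are all assumptions made for the example.

```python
from statistics import mean, stdev
from typing import Sequence


def is_unusual_jump(history: Sequence[float],
                    latest: float,
                    z_threshold: float = 3.0) -> bool:
    """Flag `latest` if it sits far above the recent mean of `history`.

    A z-score stands in here for the ML-based detector; the threshold is assumed.
    """
    if len(history) < 2:
        return False  # not enough samples to estimate spread

    mu = mean(history)
    sigma = stdev(history)

    if sigma == 0:
        # A perfectly flat history: any increase at all counts as a jump.
        return latest > mu

    return (latest - mu) / sigma > z_threshold


def should_alert_security(history: Sequence[float],
                          latest: float,
                          hour: int,
                          peak_hours: range = range(8, 20)) -> bool:
    """Alert when an unusual jump occurs outside the (assumed) peak window."""
    return is_unusual_jump(history, latest) and hour not in peak_hours


if __name__ == "__main__":
    cpu_history = [40, 42, 41, 39, 43, 40]  # hypothetical CPU utilization samples (%)
    print(should_alert_security(cpu_history, 95, hour=3))   # off-peak spike
    print(should_alert_security(cpu_history, 95, hour=14))  # same spike at peak time
```

The same function could be run per metric (network, CPU, memory, I/O), with the auto-scaler acting on the jump while the off-peak check decides whether the security team is also notified.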

The Fourth Generation

The fourth generation of the cloud is driven by workload automation technology that analyzes and predicts performance and capacity requirements while managing costs. But for it to proliferate, planners must address security and governance issues on the infrastructure. Why? Because third-party vendors need access to the infrastructure to analyze workloads and change control processes for mission-critical applications.

This level of sophistication will require refactoring legacy applications to support horizontal scaling for at least the front-end portion of the application, measuring capacity utilization in real time by using ML to detect sudden increases in workload, and implementing anomaly detection for auto-scaling in order to detect suspicious resource utilization.