IT Agility: Making Better Use Of Power Monitoring Data
Every year, IT has to do more: manage more servers, smart devices, apps, and services. End users continually raise the bar, demanding more performance and better response times. And while everything about the modern data center is growing, IT headcounts and operating expense budgets remain the same or shrink. Data center managers and their teams must continually find ways to work smarter and reduce the biggest expense items. Since energy costs have risen to the top of the list, looking for ways to consolidate and reduce power consumption is a good place to start.
Virtualization and cloud technologies have helped, making it possible for IT to more cost-effectively build and manage more energy-efficient data centers. However, with all of the attention focused on the high-level data center models, it can be easy to overlook some of the significant advancements at the hardware level. The latest servers, storage devices, switches, racks, power distribution units, cooling equipment, air handlers, and the myriad of other components are all feeding status information onto the network. Servers, in particular, have also evolved to give remote IT teams many new, fine-grained monitoring and control options.
At first glance, the built-in network-based monitoring and control functions are interesting, but IT teams certainly can’t afford to be continually querying and adjusting individual devices. Manually collecting enough data points to identify patterns and trends would be even more impractical. Solution providers have consequently evolved system consoles and data center dashboards to take advantage of middleware technology that automates the collection, aggregation, reporting, and logging of a broad range of device status information.
Many of the world’s largest data centers now take advantage of this type of all-software data center instrumentation. Highly automated IT practices include monitoring real-time server inlet temperatures and power consumption data from rack/blade servers, PDUs, and UPSs. Airflow is also monitored.
Best-in-class holistic energy management solutions consume this information and turn it into energy and thermal maps of individual server rooms and data centers. Combined with control capabilities such as power capping and dynamic server frequency adjustments, the aggregated intelligence is helping data center and facilities teams better understand and manage energy costs, and make better decisions.
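As a rough illustration of the aggregation step, the Python sketch below rolls per-server readings up into a per-rack power and thermal summary. The record layout (rack, server, power_w, inlet_c), the function name, and the sample values are all hypothetical; a real deployment would receive these readings from its own monitoring middleware.

```python
from collections import defaultdict

def build_rack_map(readings):
    """Aggregate per-server readings into per-rack totals.

    readings: iterable of dicts with the (assumed) keys
              'rack', 'server', 'power_w', 'inlet_c'.
    Returns a dict: rack -> {'power_w', 'max_inlet_c', 'servers'}.
    """
    racks = defaultdict(
        lambda: {"power_w": 0.0, "max_inlet_c": float("-inf"), "servers": 0}
    )
    for r in readings:
        entry = racks[r["rack"]]
        entry["power_w"] += r["power_w"]          # total draw for the rack
        entry["max_inlet_c"] = max(entry["max_inlet_c"], r["inlet_c"])
        entry["servers"] += 1
    return dict(racks)

# Hypothetical sample readings, for illustration only.
sample_readings = [
    {"rack": "A1", "server": "web01", "power_w": 350.0, "inlet_c": 24.0},
    {"rack": "A1", "server": "web02", "power_w": 380.0, "inlet_c": 27.5},
    {"rack": "B4", "server": "db01", "power_w": 520.0, "inlet_c": 22.0},
]

rack_map = build_rack_map(sample_readings)
```

The per-rack maximum inlet temperature is kept rather than the average, on the assumption that a single hot server is what a thermal map needs to surface.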
For example, the real-time power data improves capacity planning. In the past, IT had to rely on the manufacturer’s specifications for peak power, estimate a de-rated power specification, or carry out bench tests with simulated workloads. With logged power consumption data gathered from production servers, IT can now more confidently and aggressively provision new servers and racks. Power capping capabilities, another feature available with today’s modern servers, can be applied to make sure that the more densely populated racks do not exceed the maximums. Without the risk of subjecting equipment to damaging power spikes, the higher rack densities ultimately reduce the number of required racks as well as floor space and cooling.
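The capacity-planning arithmetic described above can be sketched in a few lines of Python. The rack budget, nameplate rating, safety margin, and logged peaks are hypothetical numbers chosen only for illustration; the sketch assumes the planned per-server figure is then enforced as a power cap.

```python
def servers_per_rack(rack_budget_w, per_server_w):
    """Number of servers that fit within a rack's power budget."""
    if per_server_w <= 0:
        raise ValueError("per_server_w must be positive")
    return int(rack_budget_w // per_server_w)

# Hypothetical values, for illustration only.
RACK_BUDGET_W = 8000                          # assumed usable power per rack
NAMEPLATE_W = 750                             # assumed manufacturer peak rating
logged_peaks_w = [412, 398, 455, 430, 441]    # assumed peaks logged in production

# Plan against the highest observed peak plus a safety margin, and
# treat the result as the per-server power cap.
SAFETY_MARGIN = 1.10
cap_w = max(logged_peaks_w) * SAFETY_MARGIN

by_nameplate = servers_per_rack(RACK_BUDGET_W, NAMEPLATE_W)
by_measured = servers_per_rack(RACK_BUDGET_W, cap_w)

print(f"Nameplate planning: {by_nameplate} servers per rack")
print(f"Measured planning with cap: {by_measured} servers per rack")
```

With these assumed inputs, planning from measured data fits more servers into the same rack budget than planning from the nameplate rating, which is the density gain the paragraph describes.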
Besides more accurate capacity planning, making use of the available power, temperature, and airflow data supports:
- Better data center operations. Besides identifying hot spots, thermal maps can point out overly cooled aisles. IT and facilities teams can also take advantage of increased visibility of power and temperature behaviors to adjust the ambient temperature in server rooms. Since a single-degree increase translates to significant annual savings in terms of cooling costs, many data centers are embracing warmer room temperatures, especially since modern servers and data center equipment are rated for higher temperatures.
- Asset utilization/consolidation. Power consumption patterns can highlight “ghost” servers, meaning servers that are idle or under-utilized. Since an idle server draws approximately 50% of its maximum specified power requirements, being able to evaluate server utilization patterns can lead to major savings. IT teams can consolidate servers or introduce on-the-fly adjustments to put idle servers into more power-conserving sleep modes.
- Workload scheduling. Job scheduling, even within highly virtualized environments, can be carried out while considering the impacts on overall power consumption. Power-aware virtual machine migration and job assignments support more energy-efficient operations, and raise awareness of the energy costs associated with individual tasks or organizations’ workloads.
- Equipment lifespan optimization. Real-time thermal maps vividly highlight hot spots and put IT in a proactive position to avoid damage to the most heavily loaded and mission-critical servers. The same power capping features that help protect densely populated racks can also mitigate thermal issues that would otherwise damage or shorten the life of servers and other data center equipment. Alternatively, IT and facilities can adjust cooling and airflow systems to eliminate the hot spots, or shift workloads to avoid them.
- Business continuity. Armed with a better understanding of the actual power requirements associated with various services, systems, or groups of users, IT can adjust disaster recovery plans to more intelligently allocate back-up resources or shift workloads during outages. More intelligently allocating power can extend the life of back-up power supplies by up to 25% based on actual experiences reported by many data center operators.
- SLA management. IT can establish power policies that guarantee the optimal execution of the high-priority services. Automatic threshold management can flag when systems, racks, or rows are approaching limits, giving IT the ability to proactively adjust resources before limits impact service levels.
- Avoidance of peak-period utility rates. Many large companies distribute data centers geographically to deliver the best possible service to each location. With visibility of the power consumption patterns, IT has the cost-reducing option of scheduling some workloads remotely to take advantage of off-peak power rates.
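As one example of putting the list above into practice, the asset utilization item can be approximated with a simple heuristic: flag any server whose logged power never rises meaningfully above its idle level. The Python sketch below assumes the roughly 50% idle figure cited above; the function name, the tolerance, and the sample data are hypothetical.

```python
def find_ghost_servers(power_log_w, max_power_w,
                       idle_fraction=0.5, tolerance=0.1):
    """Return names of servers whose draw stays near the idle level.

    power_log_w: dict of server name -> list of power samples (watts)
    max_power_w: dict of server name -> maximum specified power (watts)
    idle_fraction: assumed idle draw as a fraction of maximum power
    tolerance: extra headroom above idle before a server counts as busy
    """
    ghosts = []
    for name, samples in power_log_w.items():
        if not samples:
            continue  # no data logged, so no verdict
        idle_ceiling = max_power_w[name] * (idle_fraction + tolerance)
        if max(samples) <= idle_ceiling:
            ghosts.append(name)
    return sorted(ghosts)

# Hypothetical logged data, for illustration only.
power_log_w = {
    "web01": [205, 210, 198, 215],
    "db01": [310, 480, 520],
    "batch07": [250, 255, 248],
}
max_power_w = {"web01": 400, "db01": 600, "batch07": 500}

candidates = find_ghost_servers(power_log_w, max_power_w)
print(candidates)
```

A flagged server is only a candidate for consolidation or a sleep state; the logging window has to be long enough to cover periodic workloads such as month-end batch jobs.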
MORE CONTROL — LOWER OPEX
These are some of the many ways that data center teams are applying automation and middleware technology to gain more agility and control of data center resources. IT and facilities are better able to adjust and allocate data center assets while simultaneously reducing the energy costs for the data center.
At a higher level, software instrumentation for power and temperature monitoring is also being used to adjust power management policies for groups of servers, racks, rows, rooms, and entire data centers. The information helps shape “green” initiatives and conservation efforts. Some data centers also apply power data to charge back for services more accurately. Based on published results and surveys, intelligent energy management solutions are yielding 20% to 40% reductions in OPEX by eliminating energy waste alone.
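The charge-back idea mentioned above reduces to converting logged power samples into energy and pricing it per group. The Python sketch below assumes evenly spaced samples in watts; the group names, sampling interval, and electricity rate are hypothetical.

```python
def energy_kwh(samples_w, interval_s):
    """Approximate energy from evenly spaced power samples.

    Each sample (watts) is assumed to hold for interval_s seconds.
    """
    return sum(samples_w) * interval_s / 3_600_000  # watt-seconds to kWh

def charge_back(usage_by_group_w, interval_s, rate_per_kwh):
    """Return the energy cost per group, rounded to cents."""
    return {
        group: round(energy_kwh(samples, interval_s) * rate_per_kwh, 2)
        for group, samples in usage_by_group_w.items()
    }

# Hypothetical data: four 15-minute samples (one hour) per group.
usage_by_group_w = {
    "finance": [1000, 1000, 1000, 1000],
    "engineering": [2000, 2000, 2000, 2000],
}
costs = charge_back(usage_by_group_w, interval_s=900, rate_per_kwh=0.12)
print(costs)
```

This sketch prices only the IT load; a real charge-back model would typically also apportion cooling and other facility overhead, which is outside what the logged server data shows.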
Note that the improved agility and cost reductions are not the result of monitoring alone. As mentioned previously, the latest generations of data center equipment also make it possible to remotely adjust key parameters. For example, IT can dynamically adjust the internal power states and processor operating frequencies of data center servers.
While predominantly employed by IT, the software instrumentation that feeds power and temperature data into consoles and dashboards provides a visual, intuitive summary of environmental conditions that also helps facilities teams. The aggregated information, in fact, enhances collaboration between IT and facilities teams to better align infrastructure and building planning and management.
The combination of monitoring and these types of controls lets IT optimally balance server performance and power, without noticeably degrading the user experience or service levels. Field tests have shown that dynamic adjustments can achieve as much as a 20% reduction in server power consumption.
Thanks to rapidly rising global data consumption in our highly connected world, data center energy consumption is also on the rise. The NRDC has reported that U.S. data centers alone consumed an estimated 91 billion kWh of electricity in 2013. Power and cooling costs have become the biggest component of data center operating budgets. Gaining more visibility of the actual power and thermal patterns in the data center should therefore be considered a priority goal for any data center.
Fortunately, IT teams can introduce highly automated monitoring and control solutions to aggregate and apply real-time data that already exists throughout the data center. The significant savings from avoiding wasted energy and extending the life of equipment make a strong business case for investing in software solutions. Best-in-class solutions, in fact, offer agentless monitoring and control, which speeds deployment and start-up. The easily integrated software instrumentation minimizes the burden on the IT staff and gives both infrastructure and facilities personnel the tools they need to more effectively achieve their goals.
The latest generation of holistic energy management solutions represents a major advancement in data center monitoring and management systems and dashboards, and the middleware approach has been proven to deliver the necessary scalability to keep pace with data centers. The open, flexible software architectures also strengthen alignment with today’s flexible, on-demand service delivery models. Look for expanded feature sets and deployment options as hardware vendors and systems integrators take advantage of the continuing evolution of intelligent data center hardware and interconnect standards.
Jeff Klaus is the general manager of Data Center Manager (DCM) Solutions at Intel Corporation where he has managed various groups for more than 13 years. Klaus’s team is pioneering power- and thermal-management middleware, which is sold through an ecosystem of data center infrastructure management (DCIM) software companies and OEMs. A graduate of Boston College, Klaus also holds an MBA from Boston University. He can be reached at Jeffrey.S.Klaus@intel.com.