Even after decades of collective efforts to simplify data center management, many IT and facilities teams still feel that they are struggling just to keep up with the rapid rates of change and staggering growth. Ironically, the data centers that face the biggest challenges often rely on manual approaches for capacity planning and other critical aspects of operations management.
Intel recently commissioned Redshift Research to reach out to data centers in the U.S. and U.K. and evaluate the levels of adoption for data center infrastructure management (DCIM) tools. The survey also aimed to identify the obstacles holding back automation, and the major operational issues that burden IT and facilities teams.
BIGGEST SURPRISE: PREVALENCE OF MANUAL METHODS
It was expected that some data centers would report that they still rely on manual measurements and spreadsheets for capacity planning and forecasting. The percentages, however, were surprising. More than 40% of survey respondents admitted that they lack any DCIM tools for these tasks. Of these data centers, 56% reported that manual planning efforts consume between 40% and 60% of their time.
Surprisingly, these larger-than-expected numbers held steady even among larger data centers with more than 1,500 servers.
When asked why they haven’t automated capacity planning and other related tasks, 46% cited expense as the major inhibitor. Another 35% of the manual planners were concerned that DCIM would be too difficult to deploy.
FLIP SIDE: DCIM SUCCESSES
Every data center, with or without DCIM tools, reported similar challenges such as floor space constraints (indicated by 75%) and power constraints (63%). The respondents without DCIM acknowledged that they struggle with basic planning in both of those areas. Specifically, 32% said that they lacked actionable data for both short-term and long-term decisions.
In contrast, the survey highlighted some areas where data centers are employing DCIM to make better decisions. In these areas DCIM is changing from a perceived luxury to a mission-critical tool:
- Improving cooling efficiency. During the past year, 57% of the data centers experienced thermal-related challenges that impacted operational efficiency. The survey found that 63% were using DCIM analytics to improve cooling efficiency, avoid thermal-related failures, and achieve significant cost savings through reduced cooling power. Without DCIM analytics, data centers are less likely to be able to perform computational fluid dynamics (CFD) simulations of airflow in the data center. In fact, one in five data centers without DCIM analytics relies exclusively on manual readings of rack thermal sensors and spreadsheets.
- Analyzing downtime. Thermal problems, along with power supply and other hardware failures, contribute to outages and downtime. Among data centers that use DCIM extensively for both planning and monitoring cooling efficiency, 72% were able to quantify their cost of downtime at an average of $28,900 per outage. Of the non-DCIM adopters, only 14% could quantify this cost.
- Speeding recovery times. The results in this area also varied dramatically: 21% of DCIM users reported typical recovery within two hours, compared with an overall average recovery time of slightly less than eight hours.
LESSONS TO BE LEARNED
The survey gives data center managers and DCIM vendors alike some reasons to rethink current solutions and approaches to common data center challenges. Solution providers have to wake up to the fact that many data centers are hanging onto outdated, labor-intensive alternatives. DCIM tools need to be easy to deploy and easy to use, and vendors need to do a better job of promoting progress in these areas.
Data center managers in turn should be paying attention to the quantifiable results being achieved with DCIM. The survey points out just a few of the multiple ways that DCIM is being applied to reduce operating costs, save time for data center managers, and improve the efficient use of power and space in the data center.
The DCIM tools available today give data center managers a lot of choice. It is not necessary, or even smart, to start big. Look for tools that make it easy to leverage the power and thermal telemetry data already provided by today's data center equipment. Middleware for automatically gathering and aggregating that data can be cost-justified by a data center of any size, and can deliver immediate cost savings by identifying wasted energy and ineffective cooling.
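To make the middleware idea concrete, the sketch below shows one minimal way such aggregation could work: group per-rack power and inlet-temperature samples, compute averages, and flag racks that look over-heated or under-utilized. The rack names, sample values, and thresholds are illustrative assumptions only; a real deployment would pull live readings from equipment interfaces (for example, IPMI or SNMP) rather than from a hard-coded list.

```python
from statistics import mean

# Hypothetical raw telemetry samples: (rack_id, power_watts, inlet_temp_c).
# In practice these would stream in from server and PDU sensors.
readings = [
    ("rack-01", 4200, 24.5),
    ("rack-01", 4350, 25.1),
    ("rack-02", 6100, 31.2),
    ("rack-02", 5900, 30.8),
    ("rack-03", 1800, 22.0),
]

INLET_TEMP_LIMIT_C = 27.0  # illustrative inlet-temperature threshold
LOW_UTILIZATION_W = 2000   # illustrative "possibly idle" power floor

def aggregate(readings):
    """Group samples by rack and compute average power and inlet temperature."""
    by_rack = {}
    for rack, watts, temp in readings:
        by_rack.setdefault(rack, []).append((watts, temp))
    return {
        rack: {
            "avg_watts": mean(w for w, _ in samples),
            "avg_inlet_c": mean(t for _, t in samples),
        }
        for rack, samples in by_rack.items()
    }

def flag_issues(summary):
    """Flag racks that appear over-heated or under-utilized."""
    flags = []
    for rack, stats in sorted(summary.items()):
        if stats["avg_inlet_c"] > INLET_TEMP_LIMIT_C:
            flags.append((rack, "hot spot: check airflow and cooling"))
        if stats["avg_watts"] < LOW_UTILIZATION_W:
            flags.append((rack, "low draw: candidate for consolidation"))
    return flags

summary = aggregate(readings)
for rack, note in flag_issues(summary):
    print(f"{rack}: {note}")
```

Even a simple aggregation pass like this turns raw sensor readings into the kind of actionable, rack-level view that the manual spreadsheet approach struggles to provide.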
For the long term, DCIM solutions introduce a foundation for automation that every data center team needs to consider in light of the growing complexity of balancing user demands with space and power constraints. It is inevitable, as the Internet of Things drives up the size of the average data center, that day-to-day monitoring and capacity planning will continue to take more time for those that cling to spreadsheets and tape measures. Any steps taken to automate data gathering will open the door to better data analysis and capacity decisions in the increasingly dynamic data center space.