Data center owners and operators are increasingly looking for ways to minimize total cost of ownership (TCO), cost per kW of IT load, and downtime. In an industry where the average TCO overspend is around $27 million per MW, where cost per kW of IT load can spiral out of control within a few years of entering operation, and where the average cost of downtime is $740,000 per incident, owner/operators want solutions. An integrated, continuous modelling process can help data center administrators save millions of dollars annually per data hall.
While the amount of operational information has grown, it has remained siloed, causing organizational and physical fragmentation. Poor planning and inefficient use of power, cooling, or space often threaten efforts to minimize costs. This can force managers into a corner: do you build a new facility to help alleviate the strain or invest in a major overhaul? This is a dilemma no owner or operator wants to face.
To make the point, consider the estimated TCO of an example 1 MW facility. For such a facility, the TCO should be $32 million over 15 years. The reality, however, is often very different: a skyrocketing $59 million.
What happened to make the costs almost double? In short, there was a discrepancy between the physical capacity the space was designed for and the capacity actually available once hardware was deployed and additional infrastructure was built out.
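The arithmetic behind that doubling can be sketched in a few lines. The TCO figures are the article's example; the usable-capacity split is a hypothetical assumption (roughly the 70% utilization discussed below) used only to illustrate how stranded capacity inflates cost per kW:

```python
# Illustrative sketch: how stranded capacity inflates cost per kW of IT load.
# TCO figures come from the article's example 1 MW facility; the usable-capacity
# figure is an assumption for illustration (~70% utilization).

design_tco = 32_000_000   # planned 15-year TCO ($)
actual_tco = 59_000_000   # realized 15-year TCO ($)
design_kw = 1_000         # nameplate IT capacity (kW)
usable_kw = 700           # assumed usable capacity after fragmentation (kW)

cost_per_kw_planned = design_tco / design_kw
cost_per_kw_actual = actual_tco / usable_kw

print(f"Planned: ${cost_per_kw_planned:,.0f} per kW")  # Planned: $32,000 per kW
print(f"Actual:  ${cost_per_kw_actual:,.0f} per kW")   # Actual:  $84,286 per kW
```

The overspend compounds: not only does the total bill rise, but it is spread over fewer deliverable kilowatts.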
It’s clear that data centers have the potential to be financial black holes. To help avoid common pitfalls, we’ve identified five primary reasons why data center operations are a financially risky business:
Design chain coordination. The tendering process produces an environment where a single product (the facility) is being supplied by multiple vendors. Vendors typically don’t communicate or coordinate with each other. The resulting lack of common vision leads to problems when the data center is built and handed over.
Siloed operations. IT operations, corporate real estate, facilities engineering, etc., all plan and execute actions in their respective silos. These decisions are driven by multiple stakeholders, often with mutually exclusive interests. Such silo-based operations lead to fragmented operational processes, which in turn fragment and diminish physical capacity.
IT operations vs. conceptual design. It’s not possible for conceptual design to guarantee performance in normal operation due to changing IT and business needs. The uneven buildout of the facility over time means that most data centers will only realize a capacity utilization of about 70%.
Variable IT in a fixed infrastructure. IT hardware must be refreshed every few months or years. Newer IT hardware can have completely different requirements for space, power, and cooling resources, requiring an operational redesign.
Capacity tracking. Physical capacity is dictated by the resource that is least available — space, power, cooling, or networking. For example, when cooling is utilized faster than space and power, the data center reaches the end of its life far quicker than anticipated. Data center infrastructure management (DCIM) tools provide a powerful means to monitor and track space and power. However, there are limitations. DCIM cannot:
- Model and track cooling availability
- Relate the distributions of space, power, cooling, and IT to each other to show capacity
- Predict the impact of future IT plans on power and cooling collectively
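The capacity-tracking idea above reduces to a simple rule: usable capacity is bounded by whichever resource runs out first. A minimal sketch, with entirely hypothetical resource figures:

```python
# Minimal sketch of capacity tracking: physical capacity is dictated by the
# least-available resource. All headroom figures below are hypothetical.

remaining = {
    "space_kw":   400,  # kW of IT load the remaining floor space can host
    "power_kw":   350,  # kW of spare UPS/PDU power
    "cooling_kw": 180,  # kW of spare cooling capacity
    "network_kw": 500,  # kW-equivalent of available network ports
}

# The binding constraint determines true remaining capacity.
constraint = min(remaining, key=remaining.get)
capacity_kw = remaining[constraint]

print(f"Usable capacity: {capacity_kw} kW, limited by {constraint}")
# Usable capacity: 180 kW, limited by cooling_kw
```

In this sketch the facility still has 400 kW of space and 350 kW of power, yet only 180 kW is deliverable, because cooling is the binding constraint. A DCIM tool tracking space and power alone would overstate capacity by nearly a factor of two.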
What Can You Do About It?
When your operations are siloed, the physical and organizational infrastructure of your facility can become fragmented. To overcome this, it’s necessary to build more holistic models of your facilities (and their supply chains) to integrate knowledge from these different domains. These models highlight the dynamic inter-relationship between different aspects of operations, and help to anticipate and avoid future problems.
One way to achieve this is by mapping data from DCIM toolsets into a powerful 3-D virtual facility model. The virtual facility pinpoints where problems are occurring in your real facility, and can be used to simulate potential solutions to the issues identified. This process — called engineering simulation — overcomes the traditional limitations of DCIM, and allows managers to identify the biggest constraints on capacity and predict the impact of future changes.
Even the best designed data centers must contend with a changing business and technology environment. This makes engineering simulation essential: as well as identifying current problems, it also allows for the modelling of future changes to the facility. Simulating potential changes in advance reveals their wider impact on other sub-systems, meaning that infrastructure can be chosen for adaptability and expandability. This avoids the restrictions often imposed by short-term decision making, and helps to eliminate the expensive redesigns needed to accommodate technological or business changes.
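A full engineering simulation models airflow and electrical distribution in detail, but the underlying what-if discipline can be illustrated simply: test a proposed change against every resource, not just the one you happen to track. The function and figures below are hypothetical:

```python
# Hypothetical what-if check: before committing to a deployment, test the plan
# against every resource rather than space alone. All figures are illustrative.

def blocked_by(plan_kw, headroom):
    """Return the resources a proposed plan_kw deployment would exhaust."""
    return [resource for resource, kw in headroom.items() if plan_kw > kw]

headroom = {"space_kw": 250, "power_kw": 220, "cooling_kw": 150}

blockers = blocked_by(200, headroom)
print(blockers)  # ['cooling_kw'] — the plan fits in space and power, not cooling
```

Even this toy check surfaces the kind of surprise the article describes: a deployment that looks fine on the floor plan and power budget can still fail on cooling, and discovering that in a model is far cheaper than discovering it in production.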
Anticipating changes using a holistic model and engineering simulation allows the design and buildout of the data center to proceed evenly and effectively, and enables a smoother transition to new technological paradigms. This approach to data center operations, grounded in engineering simulation, ultimately reduces costs and increases capacity utilization.
1. A Simple Model for Determining True Total Cost of Ownership for Data Centers, Jonathan Koomey et al., Uptime Institute, 2006.
2. 2016 Cost of Data Center Outages, Ponemon Institute, January 2016.