Improving Data Center PUE Through Airflow Management
Rising energy prices and growing concerns about global warming due to carbon emissions combine to increase the need to lower the power usage effectiveness (PUE) of data centers worldwide. The PUE of a data center is defined as total facility power/total IT power. Total facility power comprises all the power delivered to the entire data center, and the total IT power is defined as the power that is delivered to the IT equipment. A careful look at this ratio (Figure 1) reveals that power to drive the data center cooling system (45 percent) and the power consumed by the IT equipment (30 percent) dominate total facility power.
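The PUE definition above can be made concrete with a short calculation. The sketch below is illustrative only: the 1,000 kW facility size and the halving of cooling power are hypothetical values chosen for the example, and only the 30 percent and 45 percent shares come from the text.

```python
def pue(total_facility_kw: float, it_kw: float) -> float:
    """Power usage effectiveness: total facility power / total IT power."""
    if it_kw <= 0:
        raise ValueError("IT power must be positive")
    return total_facility_kw / it_kw


# Hypothetical facility size; the shares are the ones quoted from Figure 1.
total_kw = 1000.0
it_kw = 0.30 * total_kw        # IT equipment share of total facility power
cooling_kw = 0.45 * total_kw   # cooling system share of total facility power

baseline_pue = pue(total_kw, it_kw)

# Hypothetical what-if: cooling power is cut in half, all else unchanged.
reduced_total_kw = total_kw - 0.5 * cooling_kw
reduced_pue = pue(reduced_total_kw, it_kw)

print(f"baseline PUE: {baseline_pue:.2f}")
print(f"PUE with cooling power halved: {reduced_pue:.2f}")
```

With IT equipment at 30 percent of facility power, the baseline PUE is about 3.3, which shows why the cooling share is the obvious place to look for savings.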
If power measurements of the cooling equipment are not feasible, estimates must be made using detailed knowledge of that equipment. For example, the cooling capacity supplied by the equipment can stand in for the power the equipment requires. In this sense, the relationship becomes a ratio of the total cooling supplied to the IT power. This ratio can be defined as the “cooling supply to IT load ratio.” Driving this ratio as close as possible to 1.0 lowers the PUE correspondingly.
Computational fluid dynamics (CFD) can help illustrate the point using a hypothetical data center of 2500 square feet, shown in Figure 2. For this data center, eight Liebert FH600C cooling units provide a total cooling capacity of 1724 kilowatts. The thermal load consists of six rows of equipment racks, each row containing 20 racks, and each rack with a thermal load of 7 kW, for a total of 840 kW. This results in a cooling supply to IT load ratio of roughly 2.0, a full 100 percent higher than should be required to cool the equipment. Notice, however, that the airflow supplied by each of the eight FH600C units is only 17,100 CFM, creating a total airflow capacity of 136,800 CFM. Each 7-kW rack requires 1091 CFM to keep the temperature rise across the rack to a 20 F maximum, so with 120 racks in the room, the total rack demand is 130,920 CFM, only about 4 percent below the supply. Although the cooling capacity is twice what the load requires, the airflow margin is thin. This will become a significant consideration when attempting to reduce the overall power consumption.
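The figures in this example are easy to check by hand. The following sketch recomputes the cooling supply to IT load ratio and the airflow balance from the numbers quoted above. The helper names are made up for illustration, and the per-rack airflow estimate assumes standard air properties (density 1.2 kg/m^3, specific heat 1005 J/kg-K), which are assumptions rather than values taken from the article.

```python
# Figures quoted in the text
N_CRACS = 8
TOTAL_COOLING_KW = 1724.0      # all eight units combined
CRAC_AIRFLOW_CFM = 17_100      # per unit
N_RACKS = 6 * 20               # six rows of 20 racks
RACK_LOAD_KW = 7.0
RACK_AIRFLOW_CFM = 1091        # per rack, for a 20 F temperature rise

it_load_kw = N_RACKS * RACK_LOAD_KW
cooling_ratio = TOTAL_COOLING_KW / it_load_kw
demand_cfm = N_RACKS * RACK_AIRFLOW_CFM


def airflow_margin(units_running: int) -> float:
    """Surplus (+) or shortfall (-) of supply airflow, as a fraction of supply."""
    supply_cfm = units_running * CRAC_AIRFLOW_CFM
    return (supply_cfm - demand_cfm) / supply_cfm


def rack_airflow_cfm(load_w: float, delta_t_f: float,
                     rho: float = 1.2, cp: float = 1005.0) -> float:
    """Airflow needed to carry load_w at a given air temperature rise.

    Uses Q = P / (rho * cp * dT) with assumed standard air properties.
    """
    delta_t_k = delta_t_f * 5.0 / 9.0          # a Fahrenheit rise in kelvin
    m3_per_s = load_w / (rho * cp * delta_t_k)
    return m3_per_s * 2118.88                  # m^3/s to cubic feet per minute


print(f"IT load: {it_load_kw:.0f} kW, cooling ratio: {cooling_ratio:.2f}")
print(f"rack demand: {demand_cfm} CFM")
for units in (N_CRACS, N_CRACS - 1):
    print(f"{units} units running: airflow margin {airflow_margin(units):+.1%}")
print(f"estimated CFM per 7-kW rack: {rack_airflow_cfm(7000, 20):.0f}")
```

With all eight units running, supply exceeds rack demand by a few percent; with one unit off, supply falls short of demand even though the cooling capacity remains well above the IT load. The per-rack estimate lands within about 2 percent of the 1091 CFM quoted in the text; the small gap comes from the assumed air properties.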
To optimize the PUE for this data center, the cooling supply to IT load ratio must be reduced to as close to 1.0 as possible. The Liebert FH600C uses an 11-kW centrifugal blower to supply air to the data center. If the cost of electricity were $0.10/kWh, the annual cost of running just the blower for this unit continuously would be roughly $9,600 (11 kW × 8,760 hours × $0.10/kWh), and would be nearly twice that amount when including the work done by the compressor. Shutting down one of these units would reduce the PUE and save money. The question, however, is whether this can be done without causing excessive temperatures at any of the server inlets. While shutting down a CRAC unit looks like a viable option, only a CFD model can identify which CRAC is the best one to shut down and whether doing so will result in troublesome hot spots on any of the equipment.
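The blower cost estimate can be reproduced with a few lines of arithmetic. This sketch assumes the blower runs continuously at its full 11 kW for 8,760 hours per year, an assumption the article does not state explicitly. Under that assumption the result lands just under $10,000.

```python
HOURS_PER_YEAR = 8760  # assumes continuous, year-round operation


def annual_energy_cost_usd(power_kw: float, price_per_kwh: float,
                           hours: float = HOURS_PER_YEAR) -> float:
    """Annual electricity cost of a constant electrical load."""
    return power_kw * hours * price_per_kwh


# Figures from the text: 11-kW blower, $0.10/kWh
blower_cost = annual_energy_cost_usd(power_kw=11.0, price_per_kwh=0.10)
print(f"annual blower cost per unit: ${blower_cost:,.0f}")
```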
Figure 3 illustrates the rack inlet temperatures in the data center with all CRACs operating normally. There are already hot spots located at the ends of the rack rows. In some cases, the rack inlet temperatures exceed the ASHRAE-recommended maximum of 80.6 F. The maximum ambient temperature in the room for this case is 96 F. Turning off both the fan and coil on any of the eight CRAC units would leave sufficient total cooling capacity, but could still cause extreme temperatures because some servers would no longer receive proper airflow. Using CFD is a straightforward way to test this possibility and to determine the best CRAC to disable.
Improving Thermal Efficiency
The two common methods for improving the thermal efficiency of data centers are hot- and cold-aisle containment. Cold-aisle containment is typically less expensive to implement because perforated tiles are often located near the rack inlets and therefore less ductwork is required. However, containing the cold supply air drives up the ambient room temperature. Depending on the resulting room temperature, this approach may not be comfortable for service technicians or administration personnel working in the room.
The CFD model can be quickly modified to consider each scenario so that these methods can be evaluated.
Table 2 compares the two approaches using the maximum rack inlet temperature and maximum ambient room temperature as common metrics. In both cases, no other heat sources in the room were included, and a small amount of leakage was permitted through the containment walls. Such leakage is inevitable whenever the racks demand more air than the CRACs can supply: air recirculates into the cold aisle when cold-aisle containment is used, or out of the hot aisle when hot-aisle containment is used. Both containment methods lower the maximum rack inlet temperature compared to the original case, but for this data center the hot-aisle containment strategy is preferable. The difference between the strategies has to do with mixing. The air that leaks out of the hot aisle mixes with the room air, increasing its temperature. The air that leaks into the cold aisle has the same effect. However, better mixing in the hot-aisle case leads to lower maximum temperatures at the rack inlets, while poor mixing in the cold-aisle case allows hotter spots to form at the rack inlets. This result does not hold for data centers in general; it is specific to this layout. In short, a hot-aisle containment scheme gives rise to a maximum inlet temperature of 77 F, so sustained operation using seven cooling units is feasible.
Without any containment, the CRAC failure analysis predicted worst-case rack inlet temperatures as high as 91 F. The hot-aisle containment solution, by contrast, increased the reliability of the data center to an “N+1” level of sustainability. This means the data center can be run with all eight CRACs on, and if any single unit fails or must be taken down for servicing, rack inlet temperatures will not exceed 77 F, which is well within the ASHRAE rack inlet temperature standard.
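The N+1 criterion described here reduces to a simple comparison: the worst-case rack inlet temperature across all single-unit failure cases must stay at or below the recommended limit. The sketch below applies that check to the two worst-case values reported in the text. The function and variable names are hypothetical, and in practice the temperatures would come from one CFD run per failed CRAC.

```python
ASHRAE_MAX_INLET_F = 80.6  # recommended maximum rack inlet temperature cited in the text


def meets_n_plus_1(worst_case_inlet_f: float,
                   limit_f: float = ASHRAE_MAX_INLET_F) -> bool:
    """True if the worst single-failure inlet temperature stays within the limit."""
    return worst_case_inlet_f <= limit_f


# Worst-case rack inlet temperatures (F) with one CRAC out, as reported in the text
worst_case_by_scenario = {
    "no containment": 91.0,
    "hot-aisle containment": 77.0,
}

n_plus_1_results = {
    name: meets_n_plus_1(temp_f)
    for name, temp_f in worst_case_by_scenario.items()
}

for name, ok in n_plus_1_results.items():
    headroom_f = ASHRAE_MAX_INLET_F - worst_case_by_scenario[name]
    status = "meets" if ok else "fails"
    print(f"{name}: {status} N+1 (headroom {headroom_f:+.1f} F)")
```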
In summary, this particular data center illustrates how CFD can be used to compare some of the many techniques available to improve PUE. When striving to improve PUE, data center managers should focus on the cooling load factor (CLF) as a primary target, along with the purchase of Energy Star equipment when replacing or adding equipment. If cooling power values are not readily accessible, the cooling supply to IT load ratio will work as well. Using this ratio, CFD can be effectively used as a decision support tool to compare and contrast alternative approaches. Of course, modeling makes assumptions that must be validated with measurements to ensure that the model represents real-world phenomena, and it is not meant to be a substitute for good engineering. Yet modeling will always produce a relative comparison of one design approach with another and is a helpful mechanism for supporting the decision-making process.