In my last column I wrote about liquid cooling and thought I did a fair assessment of the current developments and various technologies (at least in my own opinion) and their thermal advantages. However, much to my surprise, two months later (and despite my article), I found that most data centers still have not completely abandoned air cooled IT equipment and continue to wrestle with cooling issues.
This past Labor Day weekend I decided to barbecue some ribs, and while I am normally in a hurry to get them done as quickly as possible (my typical 30 minutes of burn’em, turn’em, and serve’em approach), this time, to avoid the usual faux compliments from our guests, such as “umm… crunchy,” I decided to try slow cooking them. So I looked up some cooking recommendations on the Internet (where it seems a lot of people have opinions on how to slow cook ribs).
Having a lot of time to sit down and let the ribs cook slowly (instead of standing there with my trusty fire extinguisher), I began perusing some of the recent data center discussions on LinkedIn and saw several debates about the concerns of rising temperatures in the rear of the cabinets (this of course seemed like a good topic for “Hot Aisle Insight” to me), especially from those users considering chimney cabinets. Several of the slow cooking recommendations involved the use of a smoker grill, which has a little chimney, so I thought about those chimney cabinets and began to wonder … (more on this later).
Getting back to the higher rear temperature concerns, on one hand some conservative types are just becoming aware (though they are not necessarily comfortable with it) that the 2008 second edition of ASHRAE’s Thermal Guidelines expanded the “recommended” range for Class-1 IT equipment, so that they could safely raise the cold aisle temperatures a few degrees above 68°F and the IT equipment would not suddenly burst into flames (unlike my ribs).
In contrast, other data centers have begun operating at higher temperatures, nearer to the top of the recommended range (80.6°F, or 27°C). Not surprisingly, in some cases they suddenly found out they may have to start worrying about the rising IT exhaust and resultant rear rack temperatures. I am not saying that going from IT intake temperatures of 65°F to 68°F up to 75°F to 78°F is the only cause of higher rear temperature concerns. One of the other causes, which many people are not generally aware of, is the increasing temperature rise (Delta-T) across servers, especially in some blade servers (where it can be as high as 40°F).
A bit of basic heat transfer information needs to be interjected at this point. One of the issues (and assumptions) involves elementary physics: for every Btu, watt, or kilowatt of heat that you want to “cool,” the temperature rise or fall of the fluid (air in this case) that comes into contact with the object you want to transfer the heat from (or to) is inversely related to the amount of airflow (i.e., less airflow means a higher Delta-T for a given amount of heat).
In the case of air cooling, this calculates as approximately 160 CFM of airflow needed per kilowatt, assuming a Delta-T of 20°F (a false assumption in most cases). This relationship is also true for the cooling system, which will remove heat from the return air at that same rate, and in general, most CRAC/CRAH systems use a 20°F Delta-T as a design parameter (especially those with fixed speed fans). However, if the Delta-T across a server is lower, the server will require more airflow, and if the fans slow down (to conserve energy), the Delta-T will be higher, assuming the same power/heat load. For example, here are some basic and common airflow and Delta-T numbers for a fixed heat load of one kilowatt.
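For readers who like to check the arithmetic, the airflow and Delta-T relationship described above can be sketched in a few lines of Python. This is only a rough illustration, not a design tool: it assumes the common standard-air rule of thumb of CFM ≈ 3.16 × watts ÷ Delta-T (°F), and the function names and the list of Delta-T values are my own hypothetical choices.

```python
# Rule-of-thumb sensible heat equation for air near sea-level density.
# Assumption: CFM = 3.16 * watts / Delta-T (deg F); real sites vary with
# altitude, humidity, and air temperature.

CFM_CONSTANT = 3.16  # CFM * deg F per watt (standard-air approximation)


def required_cfm(heat_watts: float, delta_t_f: float) -> float:
    """Airflow in CFM needed to carry heat_watts at a given Delta-T (deg F)."""
    if delta_t_f <= 0:
        raise ValueError("Delta-T must be positive")
    return CFM_CONSTANT * heat_watts / delta_t_f


def delta_t_for_airflow(heat_watts: float, cfm: float) -> float:
    """Delta-T (deg F) that results when heat_watts is carried by cfm of air."""
    if cfm <= 0:
        raise ValueError("Airflow must be positive")
    return CFM_CONSTANT * heat_watts / cfm


if __name__ == "__main__":
    # Illustrative Delta-T values for a fixed 1 kW heat load.
    for delta_t in (10, 15, 20, 25, 30, 40):
        cfm = required_cfm(1000, delta_t)
        print(f"Delta-T {delta_t:>2} F -> {cfm:5.0f} CFM per kW")
```

At a 20°F Delta-T this works out to about 158 CFM per kilowatt, in line with the “approximately 160 CFM” figure, and at a 40°F Delta-T the same kilowatt needs only half that airflow.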
Moreover, the Energy Star program for Servers has also driven vendors to try to shave watts wherever possible, including saving considerable fan energy (fan affinity laws state that airflow is proportional to fan speed and fan power is proportional to the cube of the fan speed, Figure 1).
FIGURE 1
As a result, the server fan speed controllers have become more intelligent and are configured to keep the fan speed (and energy) as low as possible, while still keeping the CPU and other components within a safe operating region. Even many regular (non-Energy Star) servers use this scheme to save energy by keeping the fan speed low when possible. The result is that the temperature rise across servers, and especially blade servers, is higher for a given heat load than it would be with the fans running at higher speeds, which was common on older servers.
In addition, the 2011 ASHRAE Expanded Thermal Guidelines whitepaper (and the later formal release of the third edition of the Thermal Guidelines) placed more emphasis on the wider A2 “allowable” range for most IT equipment, coupled with the “choose your own risk” X-Factor, indicating that even commodity servers could operate at up to 95°F (35°C). These higher temperature ratings were meant to help improve facility cooling system energy efficiency, as well as to promote “free cooling” wherever possible.
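To put some numbers on the fan speed trade-off, here is a minimal sketch of the idealized fan affinity laws together with the inverse airflow and Delta-T relationship. It assumes a constant heat load, ignores real-world effects such as changes in system resistance, and uses a hypothetical 20°F baseline Delta-T at full fan speed; the function names are mine.

```python
def fan_affinity(speed_fraction: float) -> tuple[float, float]:
    """Return (airflow fraction, power fraction) at a fraction of full fan speed.

    Idealized affinity laws: airflow scales with speed, power with speed cubed.
    """
    if not 0 < speed_fraction <= 1:
        raise ValueError("speed_fraction must be in (0, 1]")
    return speed_fraction, speed_fraction ** 3


def delta_t_at_speed(baseline_delta_t_f: float, speed_fraction: float) -> float:
    """Delta-T (deg F) at reduced fan speed, assuming the heat load is unchanged.

    With the same heat and less airflow, Delta-T rises in inverse proportion.
    """
    airflow_fraction, _ = fan_affinity(speed_fraction)
    return baseline_delta_t_f / airflow_fraction


if __name__ == "__main__":
    baseline = 20.0  # hypothetical Delta-T at full fan speed, deg F
    for speed in (1.0, 0.8, 0.6, 0.5):
        airflow, power = fan_affinity(speed)
        print(
            f"speed {speed:.0%}: airflow {airflow:.0%}, "
            f"fan power {power:.1%}, "
            f"Delta-T {delta_t_at_speed(baseline, speed):.1f} F"
        )
```

Under these assumptions, running the fans at half speed cuts fan power to one-eighth but doubles the Delta-T, which is how a 20°F server can quietly become a 40°F server.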
Moreover, the 2011 ASHRAE whitepaper included the introduction of the A3 and A4 extended ranges, with “allowable” intake temperatures of 104°F (40°C) and 113°F (45°C) respectively, created to endorse even greater energy saving opportunities by reducing or eliminating mechanical cooling (assuming that would spur manufacturers to build these new classes of IT equipment). While Dell was the first of the major OEMs to announce servers with A3 ratings several years ago, others waited for customer demand before releasing high temperature rated server models. IBM also had previously released a few A3 rated systems (IBM server products were subsequently sold to Lenovo, which recently offered some A4 models). HP recently announced the availability of some A3 and even A4 rated models with the introduction of its G9 servers.
Based on these new A4 models from major OEMs, I assume more customers may now be interested in operating in the extended ranges. Deployment of more A3 and A4 equipment means that eventually IT intake temperatures may rise further (though not necessarily to the top of the extended ranges), which of course will result in even higher rear exhaust temperatures. However, it should be noted that the A3 and A4 servers will need to aggressively speed up their fans when operated in the higher intake temperature ranges, which will reduce their Delta-T, somewhat mitigating the final maximum exhaust temperatures.
Lower IT fan speeds improve energy efficiency in several ways (as well as reducing fan noise). In addition to directly lowering the fan energy of the IT server, they also indirectly allow the facility side cooling units (CRAC/CRAH) to lower their CFM delivery requirements, thus lowering facility fan energy (controlled via a number of methods, such as static pressure, as well as temperature). Higher exhaust temperatures would also mean higher return temperatures to the CRAC/CRAH (especially with hot aisle containment or chimney cabinets). This would improve the thermal transfer of return air to the cooling coil, improving the units' performance and energy efficiency. And in the case of a CRAH in particular, it also allows higher chilled water temperatures, which in turn improves the chiller plant performance (greater capacity) and energy efficiency.
Sounds like all good things on many levels in theory, so what’s not to like? Well, to start, the varying conditions present a challenge for the facility side of the cooling system design and operation: meeting highly variable airflow requirements, as well as the elevated and changing Delta-Ts of the IT equipment, which do not match the 20°F CRAC/CRAH design assumptions (shown as “reference” on the chart). The wider ranges of IT fan speeds, which will rise and fall in relation to internal heat loads generated relative to actual computing loads (the Energy Star program refers to this as “Adaptive Cooling: cooling technologies that reduce the energy consumed by the cooling technology in proportion to the current cooling needs”), can negatively impact facility airflow management.
There are other, more direct concerns that have come up. The first is that the people working in the rear of the rack are not very happy having to work with the higher exhaust air temperatures, which now could readily reach 110°F (even when the intake temperatures are only 75°F to 80°F, within the ASHRAE recommended range, if they are coupled with higher Delta-Ts of 35°F to 40°F). There are also concerns about having to touch and handle hot surfaces, especially in contained cabinets (which also could involve potential issues related to OSHA, IEEE, and IEC references to “touch safe” temperatures).
The second issue involves the rack PDUs. Older PDUs and some current models on the market are only rated for 113°F (45°C), which can be reached by high intake/Delta-T combination conditions and easily exceeded if the higher “allowable” IT intake temperatures are encountered, even with more moderate Delta-Ts. Some manufacturers have addressed this with the introduction of PDUs that are rated up to 140°F (60°C). This is worth considering for future PDU requirements, even if you are not currently experiencing very high rear temperatures.
Then there is the apprehension about the potential impact of higher operating temperatures on the network cabling, which may or may not be a real problem. The category rated (5-6-7, etc.) cables are typically rated to 95°F; above that, most cabling manufacturers apply frequency and loss factor de-rating, resulting in distance limitations. However, from a practical viewpoint, these de-rating factors are based on the entire cable length operating at these higher temperatures, not just the relatively small section (6 to 8 ft) that is within the rear of the cabinet. (If anyone has had any actual problems or has specific knowledge of this, please let me know or post to my Hot Aisle Insight group on LinkedIn.)
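The intake, Delta-T, and PDU rating figures above lend themselves to a quick back-of-the-envelope check. The sketch below makes a simplifying assumption that the air around the rear-mounted PDU is at the server exhaust temperature (intake plus Delta-T), with no mixing or recirculation; the scenario list and function names are hypothetical, while the 113°F and 140°F ratings are the ones quoted above.

```python
# PDU ratings quoted in the text, in deg F.
PDU_RATINGS_F = {
    "older PDU (45 C)": 113.0,
    "high-temperature PDU (60 C)": 140.0,
}


def exhaust_temp_f(intake_f: float, delta_t_f: float) -> float:
    """Estimated rear-of-rack air temperature: intake plus server Delta-T.

    Assumption: no mixing with cooler air and no recirculation.
    """
    return intake_f + delta_t_f


def f_to_c(temp_f: float) -> float:
    """Convert an absolute temperature from deg F to deg C."""
    return (temp_f - 32.0) * 5.0 / 9.0


def pdu_headroom_f(intake_f: float, delta_t_f: float, rating_f: float) -> float:
    """Degrees F of margin below the PDU rating (negative means exceeded)."""
    return rating_f - exhaust_temp_f(intake_f, delta_t_f)


if __name__ == "__main__":
    # Hypothetical (intake deg F, Delta-T deg F) combinations for illustration.
    scenarios = [(75, 35), (80, 40), (95, 20), (104, 20), (113, 20)]
    for intake, delta_t in scenarios:
        exhaust = exhaust_temp_f(intake, delta_t)
        print(f"intake {intake} F + Delta-T {delta_t} F = "
              f"{exhaust:.0f} F ({f_to_c(exhaust):.1f} C)")
        for name, rating in PDU_RATINGS_F.items():
            margin = pdu_headroom_f(intake, delta_t, rating)
            status = "OK" if margin >= 0 else "EXCEEDED"
            print(f"    {name}: {status}, margin {margin:+.0f} F")
```

Under that assumption, a 75°F intake with a 35°F Delta-T lands at 110°F, just under the 113°F rating, while a 95°F “allowable” intake with a modest 20°F Delta-T already exceeds it.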
THE BOTTOM LINE
The IT equipment industry has many well-known standards (ASHRAE, IEEE, IEC, TIA/ANSI, etc.) that clearly define physical form factor for rack mounting, power supply voltage, quality and connectors, network and other interfaces, as well as intake temperature and humidity ranges. This has allowed most IT equipment to be universally deployed in almost any rack in any data center. However, Delta-T and CFM requirements, which directly impact facility cooling system designs and the rear exhaust temperatures, are really not well-defined as important issues or addressed as a common best practice, design target, or standard.
Currently, mainstream data center cooling systems are primarily focused on improving energy efficiency by increasing the intake air temperature to the IT equipment (within the ASHRAE recommended range and, for some sites, with excursions into the “allowable” ranges), while delivering the required amount of airflow to the IT equipment (or in reality somewhat more airflow, to overcome poor airflow management issues). Data center designs have changed significantly in the last few years, predominantly to improve cooling energy efficiency and to meet higher cooling densities. However, very few have fully addressed the issue of the variable Delta-Ts and airflow ranges of new IT equipment in their cooling systems.
Hopefully, more industry awareness of the variable nature of the new servers and other IT equipment will result in better interaction of cooling control systems with IT equipment, to compensate for and moderate rising IT exhaust and rear cabinet temperatures. When used in coordination with total energy monitoring (not just PUE), an ideal operating point of IT fan and facility fan speed/energy, as well as IT intake and exhaust temperatures, can be established and maintained under the varying computing load conditions found in the data center.
Finally, getting back to the results of my barbecue, the ribs tasted much better without fire extinguisher foam. However, for the moment it looks like I cannot actually cook ribs at the back of the rack (the lowest of the recommended temperatures seems to be above 200°F), thus negating this potentially interesting and tasty reuse of waste heat, as well as the possibility of starting my own “Hot Aisle Ribs” chain of data center-based restaurants.