Figure 1. ASHRAE Table 4 provides environmental specifications for equipment


Editor's note: Figure 3 in this article is a corrected version of a table that appeared in our May/June issue (Figure 6, p. 48).

ASHRAE Technical Committee (TC) 9.9 recently released a white paper titled “2011 Thermal Guidelines for Data Processing Environments-Expanded Data Center Classes and Usage Guidelines,” which can be downloaded free from www.tc99.ashraetcs.org. The key data in this white paper will be included in the third edition of ASHRAE’s Thermal Guidelines for Data Processing Environments. The white paper responds to increasing requests from the industry for ASHRAE to expand its temperature ranges for data centers.

An article in Mission Critical's May/June 2011 issue, “Things are getting hotter!” was a first look at the white paper.

A simple way to operate a data center is to stay within the recommended range (see figure 1) published in the white paper. However, ASHRAE opens up many possibilities to reduce operating costs and significantly save capital costs through environmental specification tradeoffs with the hardware failure rates (see figure 2).

There are multiple ways to evaluate these environmental specification tradeoffs, but the evaluation should not be undertaken casually; the information provided by ASHRAE should be carefully modeled and applied. The following proposes one method for performing this evaluation.

STEP 1A - WEATHER DATA

What temperature weather data should be used? On the surface, this seems like a simple and easy decision. However, there are a number of variables associated with the climate conditions. The first decision is what timeframe to use for the weather data:
  • Last year

  • Average year for a particular period of years (e.g., 5, 10, 25 years)

  • Maximum year for a particular period of years (e.g., 5, 10, 25 years)


Figure 2. Temperature impact on volume server hardware failure rate

There are databases and tools such as from ASHRAE that store and retrieve weather data for locations throughout the world. Depending on the location of the data center, it may be necessary to select more than one location and then develop a methodology to combine that data to best reflect the weather of the data center location.

Weather is always changing; therefore, it is important not to pursue a level of accuracy in harvesting the data that goes beyond a practical limit. Climate change is a further consideration. If the data center stakeholders believe that climate change is occurring and should be considered, then weather data adjustment factors should be applied.

Although one method would be to immediately pick one of the three obvious timeframes, a better approach is to retrieve the data for all three using three time periods (5, 10, and 25 years). The output from these data is typically stored in bins. The bin method simply selects a temperature or temperature range and labels that a bin. The number of hours that the data fits that temperature or temperature range is recorded in that particular bin.

Figure 3 is an example of a bin output. The temperature bins are Bin 1 (59°F to <68°F), Bin 2 (68°F to <77°F), Bin 3 (77°F to <86°F), and Bin 4 (86°F to <95°F) for Chicago.
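The bin method described above can be sketched in a few lines of Python. This is a minimal illustration, not ASHRAE's tool: the bin edges come from the Chicago example, while the function name `bin_hours`, the `below` and `above` overflow buckets, and the input format (one dry-bulb reading in °F per hour) are assumptions made for this sketch.

```python
# Bin edges in °F from the Chicago example; each bin covers [low, high).
BINS_F = [
    ("Bin 1", 59.0, 68.0),
    ("Bin 2", 68.0, 77.0),
    ("Bin 3", 77.0, 86.0),
    ("Bin 4", 86.0, 95.0),
]


def bin_hours(hourly_temps_f, bins=BINS_F):
    """Count how many hourly readings fall into each temperature bin.

    Readings outside all bins are tallied under the hypothetical keys
    "below" and "above" so that no hour is silently dropped.
    """
    counts = {name: 0 for name, _, _ in bins}
    counts["below"] = 0
    counts["above"] = 0
    lowest = bins[0][1]
    highest = bins[-1][2]
    for temp in hourly_temps_f:
        if temp < lowest:
            counts["below"] += 1
        elif temp >= highest:
            counts["above"] += 1
        else:
            for name, low, high in bins:
                if low <= temp < high:
                    counts[name] += 1
                    break
    return counts


if __name__ == "__main__":
    # Eight made-up hourly readings, just to show the call.
    sample = [50.0, 59.0, 67.9, 68.0, 80.0, 86.0, 94.9, 95.0]
    print(bin_hours(sample))
```

Running the same function over the 5-, 10-, and 25-year data sets gives directly comparable bin tables for the average-year and maximum-year views.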

For Step 1a, the temperature bins of figure 3 can be used or another perspective can be added by assigning a bin to each data center class (A1 to A4) to clearly highlight how many hours are considered within or beyond the class temperatures (see figure 4).

For data center owners and operators who have facilities at multiple locations, an interesting comparison is to inventory all existing facilities and contemplated new facilities using figures 3 or 4. These examples (figures 3 and 4) assume an air-side economizer system is in place with an approach factor of less than 3°F. For existing facilities, the relevant approach temperatures of the actual data center cooling systems should be used.
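A per-class view like the one just described could be produced along the following lines. This is a sketch under stated assumptions: the inlet temperature is approximated as outdoor temperature plus a fixed approach (3°F here, the conservative end of the figure quoted above), only the upper temperature limits are checked, and each hour is assigned to the tightest envelope it satisfies. The 80.6°F, 89.6°F, and 95°F limits appear in this article; the A3 and A4 limits (104°F and 113°F) are not stated here and are taken from the 2011 ASHRAE class definitions, so verify them against the white paper before use. The function name `hours_by_class` is hypothetical.

```python
# Upper dry-bulb limits in °F, ordered from tightest to loosest envelope.
# A3/A4 values are assumptions to be checked against the ASHRAE white paper.
UPPER_LIMITS_F = [
    ("Recommended", 80.6),
    ("A1", 89.6),
    ("A2", 95.0),
    ("A3", 104.0),
    ("A4", 113.0),
]


def hours_by_class(outdoor_temps_f, approach_f=3.0):
    """Assign each hour to the tightest envelope its inlet temperature meets.

    Inlet temperature is approximated as outdoor temperature + approach_f.
    Low-side limits are ignored on the assumption that an economizer can
    mix return air to hold a minimum inlet temperature.
    """
    result = {name: 0 for name, _ in UPPER_LIMITS_F}
    result["Beyond A4"] = 0
    for t_out in outdoor_temps_f:
        inlet = t_out + approach_f
        for name, limit in UPPER_LIMITS_F:
            if inlet <= limit:
                result[name] += 1
                break
        else:
            # No envelope was wide enough for this hour.
            result["Beyond A4"] += 1
    return result


if __name__ == "__main__":
    print(hours_by_class([70.0, 80.0, 90.0, 93.0, 105.0, 111.0]))
```

For an existing facility, `approach_f` would be replaced with the measured approach of the actual cooling system.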

STEP 1B - WEATHER DATA

What humidity weather data should be used? Step 1b is exactly the same exercise but applied to humidity. Humidity conditions can be just as important as temperature conditions. Technically, humidity is best analyzed using wet bulb temperatures, but a general sense of humidity conditions may be achieved using percent relative humidity (%RH).

STEP 2 - NORMAL OPERATING CONDITIONS

Step 2a. Select normal operating temperature.

The normal operating temperature for a data center or data center zone (operating temperatures may differ by zone to suit differing needs) should be set as a company standard, drawing on data from various sources and involving all key stakeholders. Some considerations include:
  • Published ASHRAE recommended temperature range

  • Published ASHRAE allowable temperature range for each class

  • IT OEM published data and warranties for ASHRAE allowable temperatures for each class

  • Regulatory and company policies regarding temperature conditions for workers including duration under those conditions

This evaluation could easily be translated into a roadmap that plans an incremental transition to higher operating temperatures.

Step 2b. Select normal operating humidity. 

Figure 3. Time-at-temperature weighted failure rate calculation for IT equipment in Chicago (based on 2010 weather data).

Step 2a is repeated but applied to humidity. The humidity policy needs to address both percent relative humidity and dew point.
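Because a policy stated in %RH and a policy stated in dew point constrain different things, it can help to convert between them when screening weather or sensor data. The sketch below uses the Magnus approximation for dew point (coefficients 17.62 and 243.12°C), which is a common engineering approximation and not something prescribed by the white paper. The function names and the limits passed in the example call are illustrative assumptions, not company or ASHRAE policy.

```python
import math


def f_to_c(temp_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def c_to_f(temp_c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return temp_c * 9.0 / 5.0 + 32.0


def dew_point_c(dry_bulb_c, rh_percent):
    """Approximate dew point (°C) from dry-bulb (°C) and %RH (Magnus form)."""
    a, b = 17.62, 243.12
    gamma = math.log(rh_percent / 100.0) + a * dry_bulb_c / (b + dry_bulb_c)
    return b * gamma / (a - gamma)


def humidity_ok(dry_bulb_f, rh_percent, max_rh_percent, max_dew_point_f):
    """Return True only if both the %RH limit and the dew point limit hold."""
    dp_f = c_to_f(dew_point_c(f_to_c(dry_bulb_f), rh_percent))
    return rh_percent <= max_rh_percent and dp_f <= max_dew_point_f


if __name__ == "__main__":
    # Example limits (60 %RH, 59°F dew point) are placeholders for a policy.
    print(humidity_ok(77.0, 50.0, max_rh_percent=60.0, max_dew_point_f=59.0))
    print(humidity_ok(80.6, 60.0, max_rh_percent=60.0, max_dew_point_f=59.0))
```

The second call illustrates why both limits matter: the air is within the %RH limit, yet its dew point is above the dew point limit.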

Step 2c. Select maximum rate of rise.

Step 2a is repeated but applied to temperature rate of rise. Since ASHRAE and the IT OEMs publish different rates of rise for servers vs. drives, it is important to address both individually. This may also expose opportunities by organizing the data center and the subsequent deployment of IT equipment into zones including zones with different rate of rise limits.
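One way to check logged inlet temperatures against a rate-of-rise limit is to look for the largest increase within any window of up to one hour. The sketch below assumes time-sorted samples of (minutes since start, °F). The per-zone limits in the example are placeholders only; actual limits should come from ASHRAE and the IT OEM documentation for the equipment in each zone.

```python
def max_rise_per_hour(samples, window_min=60.0):
    """Largest temperature increase (°F) within any window of window_min.

    samples: list of (minutes_since_start, temp_f) tuples sorted by time.
    Returns 0.0 if the temperature never rises within a window.
    """
    worst = 0.0
    for i, (t_start, temp_start) in enumerate(samples):
        for t_end, temp_end in samples[i + 1:]:
            if t_end - t_start > window_min:
                break  # later samples are outside this window
            worst = max(worst, temp_end - temp_start)
    return worst


if __name__ == "__main__":
    # Placeholder limits in °F per hour; not taken from any published spec.
    zone_limits_f_per_hr = {"server zone": 30.0, "storage zone": 8.0}
    log = [(0, 70.0), (15, 72.0), (30, 75.0), (45, 74.0), (60, 78.0), (75, 79.0)]
    rise = max_rise_per_hour(log)
    for zone, limit in zone_limits_f_per_hr.items():
        status = "within limit" if rise <= limit else "exceeds limit"
        print(f"{zone}: max rise {rise:.1f} °F/h, {status}")
```

Running the check per zone, each with its own limit, matches the idea of zoning the data center by rate-of-rise tolerance.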

STEP 3 - EXCURSION OPERATING CONDITIONS

Within this context, excursions are conditions that exceed normal operating conditions. The normal operating conditions in Step 2 can include an allowable deviation range from the operating set point.

The excursions are not necessarily the conditions immediately outside of the operating point (with the allowable deviation or tolerance) but rather established operating environmental limits that trigger the need for monitoring and limitation.

As demonstrated in the ASHRAE white paper, the hardware failure rate varies with temperature and with the duration at a given temperature. In general, colder operating temperatures result in fewer failures until the 59°F threshold is reached. Below that temperature, other issues, such as condensation, can become more problematic.

Consequently, an excursion needs to be defined by both the environmental condition and the duration. An example of an excursion is operating at 90°F for four hours per year.
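The dependence on both temperature and duration is what the time-at-temperature weighted failure rate (the annual X-factor) captures: each bin's relative failure rate is weighted by the hours spent in that bin. A minimal sketch follows. The relative failure rates and bin hours in the example are made-up placeholders, not values from the ASHRAE white paper; the real per-temperature factors should be read from the white paper's failure rate table.

```python
def weighted_failure_rate(bin_hours, relative_failure_rate):
    """Time-weighted average of per-bin relative failure rates.

    bin_hours: mapping of bin name -> hours per year in that bin.
    relative_failure_rate: mapping of bin name -> relative failure rate
        (1.0 means the same failure rate as the baseline condition).
    """
    total_hours = sum(bin_hours.values())
    if total_hours <= 0:
        raise ValueError("bin_hours must contain at least one hour")
    weighted = sum(
        hours * relative_failure_rate[name] for name, hours in bin_hours.items()
    )
    return weighted / total_hours


if __name__ == "__main__":
    # Placeholder inputs for illustration only.
    hours = {"Bin 1": 6000, "Bin 2": 2000, "Bin 3": 700, "Bin 4": 60}
    factors = {"Bin 1": 0.9, "Bin 2": 1.1, "Bin 3": 1.3, "Bin 4": 1.5}
    print(f"Annual weighted factor: {weighted_failure_rate(hours, factors):.3f}")
```

A result near 1.0 suggests that hours spent in the warmer bins are offset by hours in the cooler bins, which is the argument for defining excursions by duration as well as temperature.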

There are many possibilities regarding excursion policies. For example, the excursion policy/strategy could be:
  • IT inlet air conditions shall be in the bin range of (77°F to 86°F) for no more than 50 hours per year

  • IT inlet air conditions shall be in the bin range of (86°F to 95°F) for no more than 10 hours per year

  • IT inlet air conditions shall not exceed the maximum recommended temperature of 80.6°F for more than 100 hours per year and shall never exceed the A1 allowable temperature of 89.6°F.

Another variation would be to set the limit on hours to an IT equipment refresh cycle value (e.g., 3-year period) rather than on a per year basis. The definition of excursions and associated strategy can be particularly important when cooling service level agreements (SLAs) are in place.
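An excursion policy of this kind can be checked against logged or predicted inlet temperatures with a simple hour count per rule. The sketch below encodes the third example policy (no more than 100 hours per year above 80.6°F, never above 89.6°F) and supports the refresh-cycle variation through a `years` multiplier. The rule format and the function name `check_excursions` are assumptions made for illustration.

```python
def check_excursions(hourly_inlet_f, rules, years=1.0):
    """Compare hours above each threshold with the allowed hour budget.

    rules: list of (label, threshold_f, max_hours_per_year) tuples; a rule
        is met when the hours strictly above threshold_f do not exceed
        max_hours_per_year * years.
    Returns a mapping of label -> (hours_observed, hour_budget, compliant).
    """
    report = {}
    for label, threshold_f, max_hours_per_year in rules:
        hours = sum(1 for temp in hourly_inlet_f if temp > threshold_f)
        budget = max_hours_per_year * years
        report[label] = (hours, budget, hours <= budget)
    return report


if __name__ == "__main__":
    policy = [
        ("above recommended (80.6°F)", 80.6, 100),
        ("above A1 allowable (89.6°F)", 89.6, 0),
    ]
    # Made-up year of hourly inlet temperatures.
    inlet = [75.0] * 8680 + [82.0] * 60 + [88.0] * 20
    for label, (hours, budget, ok) in check_excursions(inlet, policy).items():
        print(f"{label}: {hours} h of {budget:g} h allowed, compliant={ok}")
```

Passing `years=3` with three years of data applies the same budgets over a refresh cycle, which allows a hot year to be offset by milder ones.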

STEP 4 - IMPLEMENTATION

The steps outlined previously are a method to correlate weather data with normal operating conditions and define allowable excursions through an excursion strategy.

The next step is considering how to implement this evaluation in the design and operation of a data center, including the IT equipment selection. As discussed within the white paper, business decisions regarding energy efficiency, reliability, and first cost must be incorporated into the evaluation and implementation.

CASE STUDY 1

Consider a data center design that will utilize an air-cooling system with air-side economizer and chilled water cooling. The operating temperature shall be allowed to fluctuate based on the outdoor air temperature but shall be maintained below the ASHRAE maximum recommended temperature of 80.6°F.

Based on these conditions, free-cooling can be achieved for almost 95 percent of the year (significant operational cost savings). However, the chilled water plant needs to be sized for the entire capacity of the data center.
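A free-cooling percentage of this kind can be estimated from hourly weather data by counting the hours in which the economizer alone keeps the inlet at or below the target. The sketch below is a simplification: it models the inlet as outdoor dry-bulb temperature plus a fixed approach and ignores humidity limits and partial economizer operation. The function name and the sample data are hypothetical; the 95 percent figure in the text comes from the authors' own analysis, not from this code.

```python
def free_cooling_fraction(outdoor_temps_f, max_inlet_f=80.6, approach_f=3.0):
    """Fraction of hours in which outdoor air alone meets the inlet target.

    An hour counts as free cooling when outdoor temperature + approach_f
    is at or below max_inlet_f.
    """
    if not outdoor_temps_f:
        raise ValueError("at least one hourly reading is required")
    free_hours = sum(1 for t in outdoor_temps_f if t + approach_f <= max_inlet_f)
    return free_hours / len(outdoor_temps_f)


if __name__ == "__main__":
    # Made-up data: 95 mild hours and 5 hot hours.
    sample = [60.0] * 95 + [85.0] * 5
    print(f"Free cooling: {free_cooling_fraction(sample):.0%} of hours")
```

The hours that fall outside the free-cooling fraction are the ones the chilled water plant must cover, which is why their number and severity drive plant sizing.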

Figure 4. Time within recommended and allowable IT equipment inlet temperatures for Chicago (based on 25-year average weather data) - developed by DLB.

Now let’s consider the implementation of an excursion strategy such that IT inlet temperatures above 80.6°F are acceptable for extreme cases (say up to 50 hours per year) provided that the temperature does not exceed the A1 allowable temperature (89.6°F). This results in a reduction in required mechanical cooling capacity of approximately 25 percent and a significant reduction in capital costs.

CASE STUDY 2

Consider a new data center that will utilize an air-side economizer. In order to reduce capital costs it is proposed that there shall be no mechanical cooling system (e.g., chillers); IT equipment rated for class A2 shall be deployed. The analysis determines:
  • The operating temperature shall be allowed to fluctuate based on the outdoor air temperature and the annual X-factor is calculated to be ~1.00, so no significant impact on reliability is anticipated.

  • The weather data predict that the maximum allowable temperature of the A2 equipment will be exceeded, voiding warranties.

  • For many hours per year, the temperatures within the data center will be such that OSHA regulations do not allow continuous light work to be performed.

Consequently, it is determined that the elimination of all mechanical cooling equipment represents too much of a risk. Subsequently, it is agreed that the acceptable maximum operating temperature is 86°F, with up to 25 hours of excursions per year provided that the temperature does not exceed 95°F.

This provision results in a 65 percent reduction in the mechanical cooling plant capacity when compared to a more traditional approach.

CONCLUSION

The application of the new thermal guidelines provides an opportunity for significant capital cost and operational cost improvements without negatively impacting IT equipment failure rates. However, successful application requires that weather data and operating data be reviewed with more rigor and detail than is the current industry norm, and that the key stakeholders reach further consensus on the operating environmental conditions.

The payback for a successful application can be substantial, both financially and in operating performance.