Water-cooled chips in the Aquasar supercomputer.
Our newest cloud computing facilities have allowed us to develop “Best in Class” data center designs and have brought us into an era where PUEs of less than 1.1 are now commonplace. In the years to come, we expect software applications and network strategies to change in ways that will make many more of our computer operations suitable for cloud-computing environments and for the “Best in Class” power and cooling solutions we now know we can deploy successfully.
PUEs are now so low that it is becoming evident the next advances in data center energy efficiency will come from somewhere other than power and cooling: the “next generation” of gains will come from the processing technologies themselves. This is where our “leadership class” supercomputers will point the way to energy efficiencies beyond where PUE can take us.
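A back-of-the-envelope calculation shows how little overhead is left to squeeze out once PUE approaches 1.1. The short sketch below uses assumed, illustrative loads; none of the figures come from any specific facility.

```python
# Illustrative only: PUE = total facility power / IT equipment power.
# The IT load and overhead figures below are assumptions, not measurements.

def pue(it_power_kw: float, overhead_kw: float) -> float:
    """Power Usage Effectiveness: total facility power divided by IT power."""
    return (it_power_kw + overhead_kw) / it_power_kw

it_load = 1000.0                                  # assumed IT load, kW
legacy = pue(it_load, overhead_kw=800.0)          # an assumed older facility: PUE 1.8
best_in_class = pue(it_load, overhead_kw=100.0)   # a PUE 1.1 facility

print(f"legacy PUE: {legacy:.2f}, best-in-class PUE: {best_in_class:.2f}")
print(f"overhead share of total at PUE 1.1: {100.0 / (it_load + 100.0):.1%}")
```

At a PUE of 1.1, power and cooling overhead is already less than 10 percent of total facility energy, so further large gains have to come from the IT load itself.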
Today’s newest supercomputers achieve much higher compute densities and require far less power, space, and cooling to perform the same work as older machines. These advances in computing technology will drive changes in our computing and data center support infrastructures and will result in exciting new products and strategies for powering and cooling the data centers of the future.
Recent research involving superior materials, cooling strategies, and testing methods is leading us to much higher operating temperatures for supercomputers. This improvement alone will allow us to move beyond current compressor-based cooling technologies and to facilitate the reuse of heat, resulting in solutions that are more energy efficient, less carbon intensive, and more cost effective than ever before.
For example, new improvements in processor architectures are allowing more direct methods of heat removal that will, in turn, support both higher power densities and higher operating temperatures. These technologies are already leading us to replace air- and refrigerant-based cooling systems with water- and dielectric-fluid-based systems, whose heat capacity and heat-removal rates are many times those of air.
Following the lead of these advances, the High Performance Computing Energy Efficiency Committee (see http://eehpcwg.lbl.gov), made up of HPC users across the country, has established a roadmap for improving the supercomputers of the future. The direction they have established is based on the premise that “liquid cooling is the key to reducing energy consumption for this generation of supercomputers and remains at the lead on their roadmap for the foreseeable future. This is because the heat capacity of liquids is orders of magnitude larger than that of air and once heat has been transferred to a liquid, it can be removed from the data center more efficiently. The transition from air to liquid cooling is an inflection point in the model for heat removal from computers providing an opportunity to work together collectively to set guidelines for facilitating the energy efficiency of liquid-cooled High Performance Computing facilities and systems.”
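The “orders of magnitude” claim is easy to check with standard textbook fluid properties. The sketch below compares the volumetric heat capacity of water and air using round-number values; it is illustrative only, not a design calculation.

```python
# Back-of-the-envelope comparison of volumetric heat capacity, using
# standard properties near room temperature.

WATER_CP = 4180.0   # J/(kg*K), specific heat of water
WATER_RHO = 1000.0  # kg/m^3, density of water
AIR_CP = 1005.0     # J/(kg*K), specific heat of air
AIR_RHO = 1.2       # kg/m^3, air at roughly sea-level conditions

water_vol_cp = WATER_CP * WATER_RHO  # J/(m^3*K)
air_vol_cp = AIR_CP * AIR_RHO        # J/(m^3*K)

print(f"water: {water_vol_cp / 1e6:.2f} MJ/(m^3*K)")
print(f"air:   {air_vol_cp / 1e3:.2f} kJ/(m^3*K)")
print(f"ratio: {water_vol_cp / air_vol_cp:,.0f}x")  # roughly 3,500x
```

A cubic meter of water absorbs roughly 3,500 times more heat per degree of temperature rise than a cubic meter of air, which is why liquid loops can carry a rack’s heat out of the room in far smaller volumes.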
This same group, under the leadership of Lawrence Berkeley National Laboratory, recently published and presented a white paper at the SC-11 Supercomputing Conference in Seattle, WA, titled “Hot for Warm Water Cooling.” The white paper presents the development of guidelines for warmer water-cooling temperatures in order to standardize new facilities and computing equipment and create greater opportunities for the reuse of waste heat. These guidelines have been delivered by the HPC group to the ASHRAE sub-committee for liquid cooling and may help establish a global standard over time (see http://gaia.lbl.gov/btech/papers/5128.pdf).
IBM is also pushing the envelope, moving beyond the air-cooled systems in its line of advanced Blue Gene supercomputers to develop an advanced water-cooling system known as Aquasar. The Aquasar computer uses 40 percent less energy and cuts carbon emissions by 85 percent compared to comparable air-cooled computers. It operates at temperatures of about 185°F and removes heat with a steady flow of warm water running through micro-channels in the electronics. No chillers or other compressor-based cooling is required to produce the supply water, and the waste-heat return water comes out of the computer at 140° to 160°F, ready to be used for productive applications such as floor-slab heating in commercial spaces and preheating water for hot-water systems.
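For a rough sense of what that warm return water is worth, the sketch below applies the basic heat-flow relation Q = ṁ·cp·ΔT. The flow rate and the temperature at which a heating loop would send the water back are assumptions chosen for illustration; only the 140° to 160°F return range comes from the figures above.

```python
# Rough estimate of heat recoverable from warm return water.
# Assumed: 40 L/min loop flow and a 110 F return from the heating application.

def f_to_c(temp_f: float) -> float:
    return (temp_f - 32.0) * 5.0 / 9.0

CP_WATER = 4180.0             # J/(kg*K)
flow_lpm = 40.0               # assumed flow rate, liters per minute
mass_flow = flow_lpm / 60.0   # kg/s (one liter of water is about one kilogram)

supply_c = f_to_c(150.0)        # mid-point of the 140-160 F return range
reuse_return_c = f_to_c(110.0)  # assumed temperature after a slab-heating loop

q_watts = mass_flow * CP_WATER * (supply_c - reuse_return_c)
print(f"{q_watts / 1000.0:.0f} kW of heat available for reuse")  # roughly 60 kW
```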
And finally, Clustered Systems, last year’s winner of “Chill-Off 2,” is finding a home in high-performance computing environments as well, with racks capable of housing power densities of 100 kW per rack … that’s 3,000 to 4,000 watts per square foot. Its conductive heat-transfer technology removes heat from blade servers more quickly and effectively than other methods and then transfers the heat out of the data center space through liquid media (see “Zinc Whiskers,” Mission Critical, Nov/Dec 2010, p. 26).
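The two density figures are consistent if each rack is allocated roughly 25 to 33 square feet of floor area. The sketch below shows the arithmetic; the per-rack floor allocations (rack footprint plus its share of aisle space) are assumptions, and only the 100 kW per rack figure is quoted above.

```python
# Sanity check on the quoted density figures.
rack_power_w = 100_000  # 100 kW per rack, as quoted

for area_sqft in (25.0, 33.0):  # assumed floor area allocated per rack, sq ft
    print(f"{area_sqft:.0f} sq ft per rack -> {rack_power_w / area_sqft:,.0f} W/sq ft")
```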
CLOUD OPERATORS CAN LEARN FROM HPC
Many believe it is inevitable that, in the not-too-distant future, many software applications will be improved so that they can operate in either public or private cloud environments. As that happens, more and more data centers will operate with less cooling and less backup power, and someday without chillers or UPS systems, as some of the advanced solutions now being developed for HPC find their way into the data centers of tomorrow.
In fact, the most forward-thinking cloud data center operators are already preparing for the changes to come by designing flexible and efficient power and cooling systems into their newest internet facilities. Their newest designs integrate containers and conventional data center modules, deliver almost any combination of air and water cooling to their aisles and racks, and provide varying Tier levels of power, cooling, and network reliability to virtually any point in the critical space. So, when you are ready to design your next data center, plan ahead so that water-cooling systems can be accommodated and multiple Tier levels of electrical power can be provided for the high performance computing technologies to come.
And consider using well-established measures of performance when making key design decisions. For example, “flexibility” and “efficiency” in operations and “economy” in construction are benchmarks that should be set early in the design of any facility. “Flexibility” can be measured by Total Cost of Ownership, “efficiency” is measured by PUE, and “economy” is best measured by cost of construction per megawatt of critical power.
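As a minimal sketch of how those three benchmarks might be tracked side by side during design, the example below uses entirely hypothetical inputs; it shows how the metrics relate, not what any real facility costs.

```python
# Hypothetical design-comparison sketch; all input figures are placeholders.
from dataclasses import dataclass

@dataclass
class FacilityDesign:
    construction_cost: float      # total build cost, dollars
    critical_power_mw: float      # megawatts of critical (IT) power
    annual_facility_kwh: float    # total facility energy per year
    annual_it_kwh: float          # IT equipment energy per year
    annual_opex: float            # operating cost per year, dollars
    lifetime_years: int = 15      # assumed planning horizon

    def pue(self) -> float:                      # "efficiency"
        return self.annual_facility_kwh / self.annual_it_kwh

    def cost_per_mw(self) -> float:              # "economy"
        return self.construction_cost / self.critical_power_mw

    def total_cost_of_ownership(self) -> float:  # proxy for "flexibility"
        return self.construction_cost + self.annual_opex * self.lifetime_years

design = FacilityDesign(80e6, 8.0, 77e6, 70e6, 6e6)
print(f"PUE {design.pue():.2f}, ${design.cost_per_mw() / 1e6:.0f}M per MW, "
      f"TCO ${design.total_cost_of_ownership() / 1e6:.0f}M")
```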
CFRT is a non-profit organization based in Silicon Valley dedicated to the open sharing of information and solutions among its members, who are critical facilities owners and operators. Please visit our Web site at www.cfroundtable.org or contact us at 415-748-0515 for more information.
Reprints of this article are available by contacting Jill DeVries at devriesj@bnpmedia.com or at 248-244-1726.