Conventional cooling challenges solved in unconventional manner
When Hong Kong-based startup Allied Control embarked on building the world’s largest 64 kW Field Programmable Gate Array (FPGA) system, it seemed that air cooling was the only practical way to deal with the issue of thermal management. Fast forward just two years, and the company is rapidly becoming the go-to specialist for passive two-phase immersion cooling, with their very own immersion cooling platform in a modular 19-in. server rack form factor.
“As we were researching alternative ways to cool our FPGA cluster, we have looked at various other ways such as water or oil cooling. That’s when one of our engineers happened to show us a video on YouTube from 3M,” says Kar-Wing Lau, Allied’s vice president of operations. “The video described a new passive liquid immersion cooling technique that offered dramatic efficiencies in energy usage and power density, without the need for complex and expensive sealed vessels, cold plates, heat sinks, or node level cooling hardware. The kinds of efficiencies they were projecting almost seemed like science fiction, but we felt it was worth a look.”
Following a series of consultations with Phil Tuma, 3M Advanced Application Development Specialist, Allied chose to incorporate 3M’s new passive two-phase immersion cooling technology into their design. Less than six months later, their FPGA system was ready for deployment. The results were eye-opening.
“The system uses less than 10% of the electricity that would have been required for a conventional air cooled system,” states Lau, “offering a power usage effectiveness (PUE) rating of 1.02. Plus, by allowing higher density component packing, and eliminating fans and other hardware, we reduced the footprint of the unit by 87%.” Lau notes that over 6,000 fans and heat sinks were part of the original design. He further comments that additional high-powered computer room air conditioners would have been required if conventional air cooling had been used. All of these considerations are especially important in subtropical Hong Kong, where energy and real estate costs are high, and the average PUE is about 2.2 for air cooled data centers.
“Our first system also revealed that immersion cooling is not just a new method to save energy by packing hardware into immersion cooling tanks. It also has the potential to fundamentally change the way hardware is built — enabling more densely populated components without all the heat transfer constraints, making hardware designers’ lives easier. It also simplifies the infrastructure because there is no need for complex CRACs or things like raised floors,” explains Lau.
Passive, two-phase (evaporative) immersion cooling is a technique pioneered by 3M in response to the growing need for more energy-efficient means of cooling electronic components in large installations such as data centers. It employs semi-open baths of non-electrically conductive 3M™ Novec™ Engineered Fluids, a family of nonflammable, non-ozone-depleting, low global warming potential materials with excellent heat transfer properties.
Energy Efficiency: Behind the Numbers
The information industry uses a common yardstick to measure energy efficiency. Power usage effectiveness, or PUE, compares the amount of electricity used for IT equipment against the entire facility’s energy consumption. However, these calculations have a blind spot: the surrounding environment.
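The ratio described above can be sketched in a few lines. This is only an illustration: the function names and the 100 kW example load are assumptions chosen for readability, not figures from any facility in this article.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power usage effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


def facility_power_kw(pue_value: float, it_equipment_kw: float) -> float:
    """Invert the ratio: total facility power implied by a PUE and an IT load."""
    return pue_value * it_equipment_kw


# Hypothetical 100 kW IT load, chosen only to make the arithmetic easy to read.
it_load_kw = 100.0
print(pue(102.0, it_load_kw))               # facility drawing 102 kW in total
print(facility_power_kw(2.2, it_load_kw))   # total draw implied by a PUE of 2.2
```

A PUE of 1.0 would mean every kilowatt entering the facility reaches the IT equipment. Anything above 1.0 is overhead such as cooling, power distribution, and lighting.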
Many of today’s most energy efficient data centers incorporate free air cooling, in which outside air is ducted directly to the IT equipment. This has caused corrosion-related reliability issues in regions with high levels of air pollution. In hot dry climates, municipal water can be evaporated to cool outside air before it enters the IT equipment. These facilities require additional maintenance, large volumes of water, and low dewpoints. In subtropical Hong Kong, many large-scale data centers can only achieve PUEs of 1.5 or 1.6 at best, while facilities on other continents report PUEs of 1.2 or less. However, some of these figures are based on only partial data and do not account for facility-wide energy consumption — let alone the weather.
Without a comprehensive metric to account for the effects of local climate, it is hard to make direct comparisons of energy efficiency across regions. But in Hong Kong, Allied Control has achieved a full-facility PUE of 1.02 — easily matching or even surpassing data centers worldwide that claim similar numbers. This suggests that Novec fluid-based immersion cooling can help data centers around the world reduce energy consumption … regardless of regional climate.
In this technique, component racks are completely submerged in a bath of Novec fluid.
“Novec fluids remove heat through direct contact with the chip or other heat source,” Tuma explains. “This raises the fluid to its boiling point. The vapor thus generated condenses back to a liquid by exposure to a condenser coil, then falls back into the bath. No energy is required to move the vapor and no chiller is needed for the condenser, which is cooled by normal facility water.”
According to Tuma, passive two-phase immersion cooling can decrease power usage by 90%, compared to conventional air cooling methods. Because of its efficiency, it allows tighter packing of components, reducing the floor space required. “It also eliminates the need for the kinds of connectors, plumbing, pumps, and cold plates associated with conventional liquid cooling,” he adds.
Based in large part on their success with the FPGA project, Allied Control began to make immersion cooling their top priority. The company found itself engaged as a contact for organizations around the world that were interested in this new technology.
“One thing stands out in the many interactions with potential clients: they usually have a heat or energy crisis to solve,” Lau states. “The inquiries we get range from standard cloud hardware and financial applications to the very unusual; for instance, how do you cool an HPC cluster with a few thousand cores on a vessel in the high seas? Heat on the vessels is a major issue, as is power consumption for cooling.”
In 2013 the company took on a new and larger challenge: design and build a 500 kW data center to be installed on an upper floor of a high-rise commercial building in the heart of Hong Kong. By attempting passive two-phase immersion cooling technology on this scale, Allied would be breaking new ground.
“The client wanted a system that was both scalable and which could accommodate updated hardware without changing the cooling system or infrastructure,” recalls Lau. “And, of course, reducing energy costs and carbon footprint were also key drivers of the design, as was the fact that they were on a tight budget. All of these considerations made passive two-phase immersion cooling the optimal choice.” Lau and his colleagues also saw this as an opportunity to build a more “universal” cooling system design, using standardized components that could be adapted for a broad variety of commercial applications.
In collaboration with several key partners around the globe, Allied began to evolve a modular design that would use standard data center infrastructure, such as 19-in. server racks; server-grade hardware, such as redundant power supplies; and other standard components. The goal was to build systems that could house any high density hardware, including GPU-based, high-performance computer systems or ASIC or FPGA-based systems for the financial market or research institutions.
Because of the cooling efficiency of the Novec fluid, these components can be spaced just a few millimeters apart, enabling more computing power to be packed into a smaller area. With Novec fluids, the capacity of the cooling system now also exceeds the heat load of any hardware that could be installed.
“For this data center project, we were able to deliver 75 kW capable racks. The currently installed hardware is merely scratching the surface of this capacity. With a simple upgrade we can double the capacity to 150 kW per rack. This compares to an average of 3 to 5 kW per rack for conventional air cooled data centers.” Lau adds that only slight modifications were made to these components, such as removing heat sinks and stacking them like dominos in order to save space. “This allowed us to provide the specified 500 kW on 20 racks, which take up only 15 sq meters (160 sq ft) of space, or about 1/10 the footprint of a conventional air cooled center of the same output.”
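The density figures quoted above can be checked with some simple arithmetic. The sketch below uses only numbers from this article (500 kW, 20 racks, 75 kW per rack, 15 square meters, and 3 to 5 kW per air cooled rack). The function names are illustrative assumptions.

```python
import math


def rack_summary(installed_kw, racks, rack_capacity_kw, floor_area_m2):
    """Per-rack load, share of rack capacity in use, and power per unit floor area."""
    per_rack_kw = installed_kw / racks
    return {
        "per_rack_kw": per_rack_kw,
        "capacity_used": per_rack_kw / rack_capacity_kw,
        "kw_per_m2": installed_kw / floor_area_m2,
    }


def air_cooled_racks_needed(installed_kw, kw_per_rack):
    """Whole racks needed to host the same load at a given air cooled density."""
    return math.ceil(installed_kw / kw_per_rack)


# Figures quoted in the article: 500 kW on 20 racks of 75 kW capacity in 15 sq meters.
summary = rack_summary(installed_kw=500, racks=20, rack_capacity_kw=75, floor_area_m2=15)
print(summary)

# Air cooled range quoted in the article: 3 to 5 kW per rack.
print(air_cooled_racks_needed(500, 5), air_cooled_racks_needed(500, 3))
```

On these numbers the installed load works out to 25 kW per rack, about a third of the stated 75 kW capacity. The same load would need on the order of 100 to 167 air cooled racks. The sketch counts racks only. It does not model aisles, air handlers, or other floor space, so it is not a direct check of the footprint comparison.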
Allied’s customer took delivery of their new 500 kW data center in October 2013, less than six months after the design was initiated. “Much of the credit for this fast deployment is due to the inherent simplicity of passive two-phase immersion cooling,” says Lau. “It eliminates the need for raised flooring and other air cooling infrastructure. We also provided very detailed specifications to our German precision manufacturing partners for the few pieces of equipment that we didn’t build in-house. This saved a lot of time working with outside vendors.” He adds that the system’s modular design makes deployment more efficient; in fact, the facility was built more quickly than the client could deliver their hardware. “The entire system fits into a single shipping container. We are big believers in modular systems, both because of their flexibility and their ease of handling.”
When fully operational, it is projected that the new data center will have a PUE of 1.02, making it the most energy-efficient facility of its kind in the APAC region by a wide margin. This translates into a savings of over 95% of the electricity that would normally be used for cooling, or about $64,000 (U.S.) per month. “Because of its ability to reduce the cost of server ownership, dramatically reduce carbon footprint, and allow more efficient harvesting of waste heat, we believe the widespread adoption of passive two-phase immersion cooling will enable transformational advances in dense computing environments,” says Lau.
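As a rough check on the savings percentage, the sketch below compares the non-IT overhead implied by two PUE values at the same IT load. It makes three simplifying assumptions that the article does not state: the baseline is the Hong Kong air cooled average of 2.2 quoted earlier, the facility runs at its full 500 kW, and all non-IT overhead is treated as cooling. The function names are illustrative.

```python
def overhead_kw(it_load_kw: float, pue: float) -> float:
    """Non-IT power implied by a PUE: (PUE - 1) * IT load."""
    return it_load_kw * (pue - 1.0)


def overhead_reduction(it_load_kw: float, baseline_pue: float, new_pue: float) -> float:
    """Fraction of the baseline overhead power avoided by moving to new_pue."""
    baseline = overhead_kw(it_load_kw, baseline_pue)
    if baseline <= 0:
        raise ValueError("baseline PUE must be greater than 1")
    return 1.0 - overhead_kw(it_load_kw, new_pue) / baseline


IT_LOAD_KW = 500.0    # data center size quoted in the article
BASELINE_PUE = 2.2    # assumed baseline: Hong Kong air cooled average from the article
NEW_PUE = 1.02        # projected PUE of the new facility

print(round(overhead_kw(IT_LOAD_KW, BASELINE_PUE), 1))
print(round(overhead_kw(IT_LOAD_KW, NEW_PUE), 1))
print(round(100 * overhead_reduction(IT_LOAD_KW, BASELINE_PUE, NEW_PUE), 1))
```

Under these assumptions the overhead falls from about 600 kW to about 10 kW, a reduction of roughly 98%, which is in line with the "over 95%" figure. The monthly dollar amount is not reproduced here because it depends on an electricity tariff that the article does not give.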
In test systems, Allied’s engineers have already demonstrated outputs of 75 kW per rack using their current designs without any changes.
“Our racks and enclosures are based on a modular design. This allows us to increase capacity simply by adding another condenser or two,” says Lau. He points out that one of the benefits of the passive two-phase immersion system is that, unlike conventional liquid cooling, you don’t have to build new customized cold plates, heat sinks, or other heat-diffusing devices whenever you update hardware. Because they are completely immersed in fluid, all components in the system are cooled equally and consistently — unlike air cooled systems, where components are subject to wide temperature variations. “Consistent cooling helps improve device performance, while helping to enhance their reliability and longevity. We are also testing various fluids with higher boiling points to save more energy, while maintaining the chips’ core temperatures below what air or water cooling would achieve in comparison.”
Allied Control views passive two-phase immersion cooling as a win-win proposition for all stakeholders in the growing field of high-capacity computing. “Consider that as much as 1.5% of the world’s electricity is consumed by data centers, with half of that devoted to cooling,” says Lau. “By eliminating 95% of the power used for cooling, this new technology can make a significant contribution to reducing CO2 emissions, easing the strain on infrastructure, and saving operational costs.”
From an engineering standpoint, Lau regards passive two-phase immersion cooling as a simpler, more elegant and sustainable solution to the challenge of electronics thermal management. “It saves a considerable amount of physical space, is easier to maintain, easier to expand, and allows manufacturers to build higher-functioning hardware, without the constraints of conventional cooling. And they can do it all at a lower cost.”
Lau sees an even greater potential for the technology in the near future. “The really exciting part is that we could easily increase the amount of computing power per rack to 150 kW or more, by redesigning components so they are optimized for passive two-phase immersion cooling,” he states. “For instance, simply making boards physically smaller, and applying treatments, such as 3M™ Boiling Enhancement Coating, to various components, in order to increase heat transfer coefficients, could greatly contribute to increasing per rack output.”
To this end, Allied demonstrates a very pragmatic approach and is ready to work with leading manufacturers and industry specialists to take advantage of the new possibilities. Lau cautions, however, that the road ahead will not be easy. “Building a practical 100 kW server rack requires more than a good cooling solution. It also presents a number of challenges that will require the specialized knowledge and expertise of many industry experts.”
Hong Kong’s ambitions to become a major global eHub are garnering a lot of attention for the new data center and for Allied Control’s growing stature as immersion cooling specialists. “Energy efficiency is key to attracting and serving major data center clients in Hong Kong,” says Lau. “China is also struggling with problems of pollution and lack of infrastructure. These are key reasons why Allied Control is focused on becoming the ‘go-to’ experts for immersion cooling solutions.”
Lau previously headed a successful German IT consulting firm and worked for a top 10 global logistics and supply chain company. He is confident that the work of his small team of international engineers will have an impact on data centers in and out of Hong Kong — a fact that he stresses is very important, considering the city’s pollution problems and high rents.