1. High Density
2. Increasing PUE
3. Dynamic Hot Spots
4. Effect on Need for DCPI Redundancy
You can read more detail about these topics in last week’s blog. This week I am going to focus on some possible solutions. Please note that I will not be addressing all possible solutions for each of these areas because a) there isn’t enough space in this blog and b) I’d be foolish to say I know of all possible solutions. This is an innovative and fast-moving area where new ideas are constantly cropping up, so please feel free to add your own.
High Density: The first step here is to assess whether or not your current cooling system can cool the new loads it will be seeing after virtualization. The total cooling load may be lower, but the configuration of the cooling system will likely not be sufficient. In other words, you may have the capacity, but not the cooling distribution needed for these high density racks. One option definitely worth exploring is “spreading out the load.” This approach doesn’t involve purchasing any new equipment, but it does have potential disadvantages such as increased floor space consumption, higher cabling costs, uncontained air paths, and a CIO concerned with half-filled racks. Still, if space is available, this method can be relatively inexpensive. If this method is not an option, then new cooling equipment may be necessary. It is possible to establish a high density “island” in your data center capable of supporting 20 kW/rack or more with a thermally neutral effect on the rest of the data center. This can be achieved by combining rack- or row-based cooling systems with hot or cold aisle containment. In addition, these approaches are extremely efficient compared with traditional raised-floor designs that rely on perimeter CRAC/CRAH units. One client created a 20-rack, 18 kW/rack “island” in the corner of a data center that supports 4 kW/rack on average. Energy-efficient room-level approaches (containment with outside economizer type cooling systems, as one example) are also possible but may not be as practical in a retrofit environment. Also note that the power distribution to the rack needs to be reviewed. While perhaps not as complex to address as the cooling scenario, three-phase 208/120V or 415/240V power distribution should be considered to support the increased per-rack power draw.
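To put rough numbers on the two ideas above, here is a minimal Python sketch. It estimates how many racks a high density load would occupy if it were spread out at a lower density, and how much power a single three-phase circuit can deliver using the standard S = √3 × V × I relationship. The breaker ratings (30 A and 32 A) and the 80% continuous-load derating are my own illustrative assumptions, not figures from the white paper, so check them against your local electrical code.

```python
import math


def racks_needed(total_load_kw: float, max_kw_per_rack: float) -> int:
    """Racks required to host a load when each rack is capped at a given density."""
    return math.ceil(total_load_kw / max_kw_per_rack)


def three_phase_kva(line_to_line_volts: float, breaker_amps: float,
                    derating: float = 0.8) -> float:
    """Usable apparent power (kVA) of one three-phase circuit.

    The derating factor is an assumption (80% continuous load is common
    practice in North America); set it to 1.0 if it does not apply to you.
    """
    return math.sqrt(3) * line_to_line_volts * breaker_amps * derating / 1000


# The client example from the post: 20 racks at 18 kW/rack.
island_load_kw = 20 * 18

# Spreading that same load out at the room's 4 kW/rack average.
spread_racks = racks_needed(island_load_kw, 4)

# Hypothetical circuits: 208 V / 30 A and 415 V / 32 A.
kva_208 = three_phase_kva(208, 30)
kva_415 = three_phase_kva(415, 32)

print(f"Racks needed when spread at 4 kW/rack: {spread_racks}")
print(f"208 V, 30 A circuit: {kva_208:.1f} kVA usable")
print(f"415 V, 32 A circuit: {kva_415:.1f} kVA usable")
```

Under these assumptions a single 208 V circuit falls well short of an 18 kW rack, which is one reason the higher-voltage distribution is worth a look.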
Increasing PUE: As discussed previously, virtualization increases the “IT efficiency” but actually decreases the “DCPI efficiency.” That is, PUE rises because the infrastructure’s fixed losses are now spread over a smaller IT load. After virtualization, the data center’s physical infrastructure (that was hopefully optimized before virtualization) is no longer the best fit for the new load configuration. It is typically oversized at this point. The obvious answer is to right-size the DCPI with scalable power and cooling that matches the new load, but there is a capex trade-off. Schneider Electric does offer a Virtualization Energy Cost Calculator Trade-Off Tool, which helps you predict your energy savings both with and without DCPI improvements. For example, a 1 MW data center that is 70% loaded and non-redundant can see its total server power draw decrease by a conservative 38% with the implementation of a 10-to-1 virtualization ratio. This leads to an annual savings of 17% in the total data center electricity bill (from $1.47M to $1.22M). However, if the DCPI were right-sized along with other improvements such as row-based cooling, implementation of high efficiency UPSs, etc., the annual savings would be 40% (from $1.47M to $0.88M). A payback period for those improvements can then easily be calculated. Still, there may be many reasons why an upgrade of the DCPI is just not possible, in which case lower impact but more feasible actions may be taken, such as using blanking panels to reduce in-rack air mixing, instituting a capacity management system to help balance capacity and demand, deploying air containment, reducing fan speeds, and simply turning off unneeded cooling units and/or removing unneeded UPS modules (for scalable UPSs).
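The savings percentages and the payback calculation mentioned above are simple enough to sketch in a few lines of Python. The three annual bills come from the example in this post; the upgrade cost is a purely hypothetical placeholder that I made up for illustration, so substitute your own quote.

```python
def annual_savings(bill_before: float, bill_after: float) -> tuple[float, float]:
    """Return (dollars saved per year, fraction of the original bill saved)."""
    saved = bill_before - bill_after
    return saved, saved / bill_before


def simple_payback_years(capex: float, extra_savings_per_year: float) -> float:
    """Years for an upgrade to pay for itself, ignoring interest and inflation."""
    if extra_savings_per_year <= 0:
        raise ValueError("the upgrade must save money to have a payback period")
    return capex / extra_savings_per_year


# Annual electricity bills from the example in the post.
BASELINE = 1.47e6          # before virtualization
VIRTUALIZED = 1.22e6       # virtualization only
RIGHT_SIZED = 0.88e6       # virtualization plus DCPI improvements

# Hypothetical cost of the DCPI improvements (an assumption, not a real quote).
ASSUMED_UPGRADE_CAPEX = 1.0e6

_, pct_virtualized = annual_savings(BASELINE, VIRTUALIZED)
_, pct_right_sized = annual_savings(BASELINE, RIGHT_SIZED)

# Only the savings beyond virtualization alone count toward the DCPI upgrade.
extra_per_year, _ = annual_savings(VIRTUALIZED, RIGHT_SIZED)
payback = simple_payback_years(ASSUMED_UPGRADE_CAPEX, extra_per_year)

print(f"Virtualization only: {pct_virtualized:.0%} saved")
print(f"With right-sized DCPI: {pct_right_sized:.0%} saved")
print(f"Payback on assumed upgrade cost: {payback:.1f} years")
```

Note that the payback is computed against the extra savings from the DCPI work alone, since the first 17% comes from virtualization whether or not you upgrade.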
Dynamic Hot Spots: Along with the benefits of dynamic IT loads, virtualization can also create dynamic hot spots that can jeopardize those loads. The best cure for this issue is a solid data center infrastructure management (DCIM) software system. DCIM software can offer the real-time, automated control that this type of problem demands. This software not only monitors and reports on the health and capacity status of the power and cooling systems; it can also be used to keep track of the various relationships between the IT gear and the physical infrastructure. Knowing which servers, physical and virtual, are installed in a given rack, along with which power path and cooling system each is associated with, should be required knowledge for good VM management. This enables capacity planning (i.e., exactly where do I put my next server so a hot spot doesn’t develop), real-time, automated adjustments to the cooling system to address a potential hot spot before it happens, as well as real-time, automated shifting of the IT load to a safer physical location should the power and cooling at the current location become jeopardized.
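To illustrate the capacity planning question ("where do I put my next server?"), here is a toy Python sketch. It is not how any particular DCIM product works; the `Rack` record, its fields, and the "most headroom wins" rule are all hypothetical simplifications. The point is only that a placement decision needs both the power and the cooling picture for each rack.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rack:
    """Hypothetical per-rack record a DCIM system might keep."""
    name: str
    power_capacity_kw: float    # what the power path can deliver
    cooling_capacity_kw: float  # what the cooling system can remove
    load_kw: float              # current IT load in the rack

    def headroom_kw(self) -> float:
        # The tighter of the two limits decides how much more the rack can take.
        return min(self.power_capacity_kw, self.cooling_capacity_kw) - self.load_kw


def pick_rack(racks: list[Rack], server_kw: float) -> Optional[Rack]:
    """Choose the rack with the most headroom that can still fit the server."""
    candidates = [r for r in racks if r.headroom_kw() >= server_kw]
    if not candidates:
        return None  # nowhere safe to put it; time to add capacity
    return max(candidates, key=Rack.headroom_kw)


racks = [
    Rack("A1", power_capacity_kw=18, cooling_capacity_kw=18, load_kw=17),
    Rack("B4", power_capacity_kw=18, cooling_capacity_kw=12, load_kw=6),
    Rack("C2", power_capacity_kw=8, cooling_capacity_kw=8, load_kw=3),
]

choice = pick_rack(racks, server_kw=2.0)
print(choice.name if choice else "no rack has enough headroom")
```

Rack B4 has plenty of power but is limited by its cooling, which is exactly the "capacity but not distribution" situation described earlier. The same headroom check could also flag which racks to move VMs away from when power or cooling is at risk.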
Effect on Need for DCPI Redundancy
One of the positive effects on DCPI created by virtualization is the possibility that a highly virtualized environment may not require a highly redundant DCPI design. Basically, with virtualization the redundancy required for a highly available data center can shift to the IT equipment. Workloads, entire virtual machines, and virtualized storage resources can be automatically and almost instantly relocated to safe areas of the network when problems arise. It’s not perfect and there may be latency issues, but this can often occur without the application end user even knowing. This presents the opportunity to design the DCPI at an N+1 level rather than 2N or 2(N+1). It’s possible to save 35% or more on the required capex to build a data center with this approach. Of course, this requires strong collaboration between facilities and IT very early in the data center planning process. We’ll leave that conversation for another day.
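For readers less familiar with the redundancy shorthand, this small Python sketch counts how many equal-sized modules (UPS modules, for instance) each design level calls for. The 700 kW load is the 1 MW, 70% loaded example used earlier in this post; the 100 kW module size is a hypothetical value I picked for round numbers. Module count is only a rough proxy and will not reproduce the capex figure above, since real costs also include distribution, switchgear, and floor space.

```python
import math


def modules_required(load_kw: float, module_kw: float) -> dict[str, int]:
    """Module counts for common redundancy levels.

    N is the minimum number of modules needed to carry the load.
    """
    n = math.ceil(load_kw / module_kw)
    return {
        "N": n,
        "N+1": n + 1,             # one spare module
        "2N": 2 * n,              # two full, independent systems
        "2(N+1)": 2 * (n + 1),    # two independent systems, each with a spare
    }


# 1 MW data center at 70% load, with assumed 100 kW modules.
counts = modules_required(load_kw=700, module_kw=100)

for level, count in counts.items():
    print(f"{level:>7}: {count} modules")
```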
By following some of these recommendations, you can maximize the cost savings you were likely virtualizing for in the first place. I look forward to any comments and/or additional recommendations. Again, if you are interested in learning more about these and other similar issues, take a look at one of Schneider Electric’s newest white papers, white paper 118.