BEST PRACTICES FOR AN EFFICIENT CLOUD ENVIRONMENT
- Plan for growth
- Increase data center density
- Increase efficiency at reduced load
- Ensure high availability
- Implement a path for DCIM
IT managers must optimize the performance and efficiency of the data center without increasing the risk of downtime; Emerson Network Power calculates the average cost of a data center downtime event at $505,000. Organizations should therefore consider the power and cooling infrastructure available, the sophistication of their monitoring and controls, and variable capacities that can match infrastructure to IT demand in order to optimize efficiency and reduce waste.
The following best practices can help ensure availability while also maintaining an efficient cloud environment.
Build/retrofit with higher densities in mind.
Data centers are growing so fast that data center managers are having a difficult time providing inventory to keep up with demand. According to the spring 2013 Data Center User’s Group (DCUG) survey, many IT managers are experiencing issues that limit growth, specifically:
- Hot spots
- Running out of power and cooling
- Lack of floor space
When planning for growth, consider it from a rack density perspective. High-density computing environments pushing 6 to 8 kilowatts (kW) or more per rack are challenging the infrastructure. To add new applications and expand to meet business goals, data center managers have to plan for growth and know how and where IT assets are going to be deployed.
Start by creating a dynamic capacity infrastructure. You may be adding servers in large quantities, and frequent moves, additions, and changes will be the norm. Think about options to rapidly deploy your infrastructure. Make sure you have a scalable, flexible architecture that allows you to move not only your IT assets, but also utilization densities, from a power and cooling standpoint.
When implementing these actions, you might also be deploying modular busways that allow you to connect new rack applications to existing ones. To avoid downtime during this activity, consider two-stage power distribution. Many data center managers have no tolerance for downtime, even during sizable live updates, additions, retrofits, or other activities; that is when two-stage power distribution pays off. If the infrastructure is designed so that you can make moves, additions, and changes without affecting other applications, life is much easier. You don’t have to go to a change board or involve your IT specialists to make sure they understand the implications of every change.
According to the spring 2013 DCUG survey, the average data center density was 5.94 kW per rack, but respondents projected that their density could top 9 kW within two years. Almost two-thirds of respondents had at least one rack drawing more than 12 kW, along with the extreme heat that entails.
When increasing data center density within a row-based infrastructure, plan to support not just the virtualization and high-density zones but also the UPS, distribution, cooling, and sensors within those zones. On the power side you want scalability to implement modular expansions. For thermal management, smaller, more efficient cooling directed at the point of heat generation is needed. You can push efficiencies much higher by segregating the data center into low-, mid-range-, and high-density zones and adjusting operations and the power and cooling infrastructure to best function within each zone.
As server-based compute demands grow, there is finite physical space to expand existing data centers, and the cost to build new data center space is prohibitive. Data center managers should ensure they can accommodate these utilization densities on a per-rack basis, so that data center space can be gained, and utilization rates and density increased, without continually building out the data center. If a site currently running 4 to 5 kW per rack targets 10 kW, and its power and cooling are designed for 50% to 100% internal capacity growth, the effect is like adding another data center. Users should completely outfit the existing space with high-density capability in order to push per-rack density. Doing so unlocks capacity that is already in place and avoids having to build another data center.
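The arithmetic behind that claim can be sketched in a few lines. This is a hypothetical illustration using the figures cited above (4 to 5 kW per rack today, a 10 kW per rack target); the rack count is an assumption for the example.

```python
def capacity_gain(racks, current_kw_per_rack, target_kw_per_rack):
    """Return total IT capacity before/after a density upgrade (kW),
    plus the fractional growth achieved without adding floor space."""
    before = racks * current_kw_per_rack
    after = racks * target_kw_per_rack
    return before, after, after / before - 1.0

# 100 racks at 5 kW each, pushed to 10 kW each
before, after, growth = capacity_gain(100, 5.0, 10.0)
print(f"{before:.0f} kW -> {after:.0f} kW ({growth:.0%} growth in place)")
```

Doubling per-rack density doubles total capacity in the same footprint, which is the sense in which it is "like adding another data center."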
Take advantage of variable utilization with a variable infrastructure.
Not all data centers consume the same amount of power 24 hours a day, seven days a week. Consumption can vary greatly with the time of day, day of the week, and time of year. You want to be as efficient as possible at all loads. To do this, take advantage of variable utilization throughout your infrastructure. Today’s cooling and power systems can perform just as well at light loading as at full loading. To deploy that variable infrastructure when CPU utilization is low, run the UPS in economy mode; this can push efficiency up 3% to 5% over double-conversion mode. Variable-speed drives on cooling units can cut energy consumption in the data center by a factor of two, and high-efficiency electronically commutated (EC) fans can reduce energy consumption as well.
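To put the 3% to 5% eco-mode gain in perspective, here is a rough, hypothetical annual-savings estimate. The IT load (500 kW) and the specific UPS efficiencies (94% in double-conversion, 98% in economy mode, consistent with the 3-5 point range above) are illustrative assumptions, not figures from the article.

```python
HOURS_PER_YEAR = 8760

def ups_loss_kwh(it_load_kw, efficiency):
    """Energy lost in the UPS over a year at a constant IT load (kWh)."""
    return it_load_kw * (1 / efficiency - 1) * HOURS_PER_YEAR

it_load = 500.0  # kW, assumed constant for simplicity
saved = ups_loss_kwh(it_load, 0.94) - ups_loss_kwh(it_load, 0.98)
print(f"Eco mode saves roughly {saved:,.0f} kWh/year at a {it_load:.0f} kW load")
```

Even a few points of UPS efficiency compound into a six-figure kWh saving at scale, which is why economy mode matters at light loads.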
As shown in Figure 1, a recent study conducted by the Ponemon Institute surveyed 400 data centers and asked site managers about outages. Ninety-five percent had experienced a full or partial outage of some form.
To ensure high availability, it’s important to look at the source of the problems. The Ponemon Institute identified UPS battery failures as the number one issue, which tracks closely with the failure data Emerson Network Power maintains for the industry. In fact, Emerson found that nearly 50% of outages are linked to insufficient or bad batteries. Other causes include exceeding UPS capacity; running out of power results in an outage. Human error, such as accidentally pushing an emergency power-off button, is another constant concern.
Having strict measures and procedures in place can help. Conduct “pull the plug” tests to make sure the data center operates the way it is supposed to. When the utility fails and something else fails at the same time, you will find out whether your system is fault tolerant. Looking at these causes of unplanned outages, the same Ponemon study found that 80% of them were avoidable. Don’t fall into that trap. Ensure high availability by accounting for everything that could take an application down.
Ensuring availability also saves money. According to the Ponemon study, a partial data center outage involving the loss of IT applications and utilization costs approximately $250,000 on average; a full outage costs $680,000. These figures include the cost to recover from the outage, replacing lost data, restarting IT applications, damage to business reputation, and payments under service-level agreements (SLAs). The external environment is also something to consider when ensuring high availability: any environment you lease should account for these same costs so you are not exposed to such penalties if you experience a full or partial outage.
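Those per-incident figures make the financial case easy to model. The sketch below uses the Ponemon costs cited above ($250,000 per partial outage, $680,000 per full outage); the outage frequencies are illustrative assumptions, not figures from the study.

```python
def expected_annual_outage_cost(partial_per_year, full_per_year,
                                partial_cost=250_000, full_cost=680_000):
    """Expected yearly outage cost in dollars, given outage rates."""
    return partial_per_year * partial_cost + full_per_year * full_cost

# Assumed example: two partial outages/year, one full outage every 4 years
print(f"${expected_annual_outage_cost(2, 0.25):,.0f} expected per year")
```

A model like this puts a concrete number against the cost of the availability investments (battery maintenance, fault-tolerant design, "pull the plug" testing) discussed above.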
Planning for data center infrastructure management (DCIM) is a partnership between IT and facility staffs. It also involves aligning virtual computing, IT infrastructure, and physical infrastructure.
After all, you can’t understand the possibilities within your data center environment if you don’t know what’s currently going on in it. Whether you physically walk the floor or have automated systems in place to monitor operations, it’s important to understand how your physical and IT infrastructures connect.
For example, if a large data center were undergoing consolidation, management would need to know how much energy each application was going to consume so that, when they set up a virtual environment, they could understand the impact on their distribution and overall power architecture. Even if total power usage is not a concern, because virtualization can help free up stranded capacity, exceeding the capacity of individual circuits could be a problem.
When aligning the physical infrastructure with the virtual server infrastructure, start incrementally. Take advantage of the communication cards available on UPS and cooling equipment. Temperature sensors monitor current conditions in the data center, and IT asset management and DCIM software can assist with data center and capacity planning as well as moves, additions, and other changes.
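The kind of check a DCIM monitoring layer runs over those sensor feeds can be sketched as follows. This is a minimal illustration, not any vendor's API: the sensor fields, the 27 °C inlet limit, and the 80% circuit-headroom rule are all assumptions chosen for the example.

```python
from dataclasses import dataclass

@dataclass
class SensorReading:
    rack: str
    inlet_temp_c: float  # rack inlet temperature from a zone sensor
    load_kw: float       # power drawn on the rack's circuit

def check_rack(reading, max_temp_c=27.0, circuit_limit_kw=10.0):
    """Return alerts for readings outside assumed safe limits."""
    alerts = []
    if reading.inlet_temp_c > max_temp_c:
        alerts.append(f"{reading.rack}: hot spot ({reading.inlet_temp_c} C)")
    if reading.load_kw > 0.8 * circuit_limit_kw:  # keep 20% headroom
        alerts.append(f"{reading.rack}: circuit near capacity ({reading.load_kw} kW)")
    return alerts

# A rack running hot and close to its circuit limit trips both alerts
print(check_rack(SensorReading("R12", inlet_temp_c=29.5, load_kw=8.4)))
```

Flagging circuits as they approach capacity is exactly the consolidation risk noted above: total power may be fine while an individual circuit is not.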
With approaches like virtualization and cloud computing, data centers will continue to become larger and denser. As this environment continues to shift, the primary challenges in managing the changes will be efficiency, capacity, and availability. The best cloud deployments will employ a high-availability environment that allows for future planning; will have a flexible infrastructure that can make capacity changes almost on the fly without sacrificing availability; and will operate at high efficiency while maintaining the availability of the data center.