The quest for carbon reduction in data centers is starting to heat up. In the past, the task of power savings (with the happy byproduct of carbon savings) fell mainly on the facilities teams. The functions of capacity planning, resource usage (on and off the floor), disaster recovery, and daily operations lie with data center operations but predominantly address facility needs. While server teams and networking teams work to support applications, they may or may not have an eye toward sustainability and power savings. Data center infrastructure management (DCIM) became the tool of choice, but several organizations struggle with the time requirements to correctly populate and maintain those systems.
Facilities and engineering personnel design and model based on the needs of active electronics, but who is working to ensure the active components are the right choice? Relying on vendor information about capabilities and power requirements to guide product decisions can be flawed. Manufacturers and OEMs work through typical configurations, which may or may not match the configuration actually installed. Different CPUs, core counts, threads, memory configurations, and power supplies create variances in power consumption. Power requirements are as unique as the configuration options available. Real-time monitoring is some help, but it is not available at a project's design and hardware selection phase.
The first thing to know is that there is no standard configuration for testing. This is true for almost any data center component, from UPSs to servers and everything in between. For example, safety testing is just that: a test to ensure the safety of an apparatus. Likewise, performance testing verifies that the product meets its designed performance specs. Testing is an expense in terms of both money and time. When a product does not perform as expected, the cycle of redesign, fabrication, internal testing, and external verification of those tests restarts. These cycles can be lengthy and are often pushed back further due to scheduling conflicts with the testing labs. Manufacturers work to put their best product forward for testing, but in reality, testing every configuration is simply not cost-effective given personnel, time, and monetary constraints. Lastly, the modeled power will fluctuate with loads and demand on the compute infrastructure.
While some end-user companies have the bandwidth and means to test multiple platforms in their dev/ops labs, many do not and, instead, rely on salespeople and subject matter experts to help with purchasing decisions. While there is nothing wrong with trusting your vendors, it is essential to note that they don't know the ins and outs of other vendors’ equipment.
Independent testing and AI
Most manufacturers have online tools to help users model hardware and provide power guidance. But, again, we are back to “modeled” as opposed to actual power requirements. The difference between assigned power and realized power is stranded power. Stranded power is the power that is allocated but not used. Overall, the data center industry estimates stranded power to be between 60% and 65% of total facility power. While server power demands don’t account for all of this, they contribute to the waste, especially if the modeled power is significantly greater than the utilized power. From an IT perspective, servers represent roughly 70% of the IT power draw in a data center. Even in a cloud environment where companies size their equipment, there is potential waste if the servers are incorrectly sized.
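The definition above reduces to simple arithmetic: stranded power is allocated power minus realized power. A minimal sketch follows, assuming a single rack with placeholder figures; the function names and the 10 kW and 4 kW values are illustrative and do not come from any measured environment.

```python
def stranded_power_kw(allocated_kw: float, realized_kw: float) -> float:
    """Return the power that is allocated but not used, in kW."""
    if allocated_kw < 0 or realized_kw < 0:
        raise ValueError("power values must be non-negative")
    # If realized draw exceeds the allocation, nothing is stranded.
    return max(allocated_kw - realized_kw, 0.0)


def stranded_fraction(allocated_kw: float, realized_kw: float) -> float:
    """Return stranded power as a fraction of the allocation (0.0 to 1.0)."""
    if allocated_kw == 0:
        return 0.0
    return stranded_power_kw(allocated_kw, realized_kw) / allocated_kw


# Hypothetical rack: placeholder numbers for illustration only.
allocated = 10.0  # kW assigned from modeled (nameplate-style) figures
realized = 4.0    # kW actually drawn under load

print(f"Stranded: {stranded_power_kw(allocated, realized):.1f} kW "
      f"({stranded_fraction(allocated, realized):.0%} of allocation)")
```

The closer the modeled figure is to the measured draw, the smaller the stranded share becomes, which is why measured server power matters at the selection stage.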
In an on-premises environment, when a server goes away, there are savings to the user that go beyond costs. One less server generally means:
- Lower power.
- Lower cooling/heat rejection demand.
- Two fewer switch ports.
- Two fewer UPS/power strip connections.
- Less rack space.
- Fewer cabinets.
- Lower pathway needs.
- Fewer copper and fiber connections.
- Decreased licensing costs.
- Fewer risk points.
- Decreased maintenance costs.
- Decreased maintenance labor.
- Reduced carbon burden for Scope 1, Scope 2, and Scope 3 emissions.
- IT asset cost recoup.
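Several of the items above scale directly with the number of servers removed. The sketch below tallies them, taking the two switch ports and two power connections per server from the list; the rack-unit figure per server is a hypothetical default, and the function and class names are illustrative.

```python
from dataclasses import dataclass

# Per-server counts taken from the list above.
SWITCH_PORTS_PER_SERVER = 2
POWER_CONNECTIONS_PER_SERVER = 2


@dataclass
class FreedResources:
    servers: int
    switch_ports: int
    power_connections: int
    rack_units: int


def freed_by_removal(servers_removed: int,
                     rack_units_per_server: int = 2) -> FreedResources:
    """Tally infrastructure released when servers leave the floor.

    rack_units_per_server is an assumed value; adjust it to the
    form factor actually deployed.
    """
    if servers_removed < 0:
        raise ValueError("servers_removed must be non-negative")
    return FreedResources(
        servers=servers_removed,
        switch_ports=servers_removed * SWITCH_PORTS_PER_SERVER,
        power_connections=servers_removed * POWER_CONNECTIONS_PER_SERVER,
        rack_units=servers_removed * rack_units_per_server,
    )


print(freed_by_removal(100))
```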
In the cloud, enterprises that understand their server needs, rather than picking servers at random and assuming hardware requirements, can make right-sizing decisions based on facts.
Savings increase when multiplied across the server estate. Fewer servers in the estate decrease the risk profile, and the carbon burden decreases. Scope 1, Scope 2, and Scope 3 emissions define the carbon burden reporting from the U.S. Environmental Protection Agency (EPA). These categories, from A-1 40 CFR Part 98, combine as CO2e (carbon dioxide equivalents), which includes all greenhouse gases (GHGs). Scope 1 emissions are direct GHG emissions from sources controlled and owned by an entity. Scope 2 emissions cover the indirect emissions from purchased energy sources. Scope 3 emissions are all other indirect emissions within a company’s entire value chain, as noted below.
- Scope 1 Direct Emissions (owned assets) include facilities, equipment, vehicles, and on-site landfills.
- Scope 2 Indirect Emissions (purchased) include purchased electricity, heating, and cooling.
- Scope 3 Indirect Emissions (third party) include transport, distribution, waste, energy and fuel, leased assets, and travel.
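For servers, the most direct link to these categories is Scope 2, since server energy is purchased electricity. A common simplification estimates it as annual energy multiplied by a grid emission factor. The sketch below assumes a constant average draw; the 0.5 kW server draw and the 0.4 kg CO2e/kWh factor are placeholders, not published values, and should be replaced with measured power and the factor for the relevant grid.

```python
HOURS_PER_YEAR = 8760  # 365 days x 24 hours


def annual_energy_kwh(average_power_kw: float) -> float:
    """Energy used in a year at a constant average draw."""
    return average_power_kw * HOURS_PER_YEAR


def scope2_emissions_kg(energy_kwh: float,
                        grid_factor_kg_per_kwh: float) -> float:
    """Estimate Scope 2 emissions (kg CO2e) from purchased electricity."""
    return energy_kwh * grid_factor_kg_per_kwh


# Placeholder inputs for illustration only.
average_server_power_kw = 0.5  # hypothetical measured average draw
grid_factor = 0.4              # hypothetical kg CO2e per kWh

energy = annual_energy_kwh(average_server_power_kw)
emissions = scope2_emissions_kg(energy, grid_factor)
print(f"{energy:.0f} kWh/year -> {emissions:.0f} kg CO2e/year")
```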
With AI and machine learning (ML), it is now possible to compare brands and models to find the best servers for an environment, either within a particular manufacturer’s portfolio or across multiple manufacturers' offerings. Rightsizing an environment shrinks the server footprint and, with it, carbon emissions.
Health care system A had an original environment of roughly 2,700 servers across its virtualized platforms and used full market intelligence to determine how the same load would run on an optimized server estate over three years. The results are shown in Figure 1.
The comparison across manufacturers in this example yields an 82% savings in overall energy in the configuration optimized for energy, utilizing 383 servers as opposed to 2,741. In the cost-optimized model, the key focus was on equipment savings while also investigating power and carbon savings. That model sees the number of servers decrease to 692 with an overall carbon and energy savings of 77%. Importantly, these are not modeled power numbers; power was measured in the lab using actual servers with varied configurations.
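The server counts in this example can be turned into a reduction percentage with a few lines of arithmetic. The sketch below uses only the counts quoted above; the function name is illustrative. Note that it computes the reduction in server count, which is a different quantity from the 82% and 77% energy savings reported from lab measurements.

```python
def percent_reduction(before: int, after: int) -> float:
    """Return the percentage decrease from before to after."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


ORIGINAL_SERVERS = 2741  # original estate, from the example above

# Optimized server counts, from the example above.
scenarios = {
    "energy-optimized": 383,
    "cost-optimized": 692,
}

for name, count in scenarios.items():
    removed = ORIGINAL_SERVERS - count
    pct = percent_reduction(ORIGINAL_SERVERS, count)
    print(f"{name}: {removed} fewer servers ({pct:.1f}% reduction)")
```

The count reduction and the energy savings differ because the replacement servers do not draw the same power per unit as the ones they replace.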
With the database built for the various servers and configurations, the modeling is done off-floor without access to the network or servers. The models can be run in multiple scenarios over the entire server estate, or servers can be placed in groups for modeling based on function. The point is that the number of options is nearly limitless. That level of information and intelligence is invaluable and just one example of how AI/ML bring servers into cost savings, energy savings, and carbon reduction strategies using accurate (not modeled) data. These tools can reduce carbon emissions and drastically reduce running costs by putting the focus on servers instead of the entire facility.