Reliable infrastructure is the lifeblood of a data center, and reliability problems can quickly put its operation at risk. A loss of power or cooling can disrupt critical services, including revenue-generating ones, bring unexpected costs, cause non-compliance with service level agreements and increase the risk of litigation and negative media coverage. The most reliable data centers are run by teams that keep their finger on the pulse of their critical infrastructure equipment and systems and identify problems before those problems affect operations.

The data center risk assessment is a tool for identifying and prioritizing hidden issues, then resolving and documenting them, in order to mitigate the risks associated with data center downtime. The assessment provides a road map of the risks associated with the data center's electrical, mechanical, security and communications systems, including power and cooling.

Data Center Risk Assessment Scope

The site assessment scope includes but is not limited to the following:

  • Site location: Overall site assessment evaluating risks associated with the location of the data center site, including its proximity to airports, railways, interstates, hazardous material storage and flood plains.
  • Building: Assessment of the architectural features focusing on visual inspection of building shell, interior walls, building envelope, plenum space and other critical areas. 
  • Critical systems: Assessment of critical mechanical and electrical systems to determine reliability and any potential vulnerability. The assessment includes evaluation of equipment age, condition, capacity, energy efficiency, power and cooling distribution, Tier level and maintainability. The assessment identifies any single points of failure and proposes solutions to eliminate them.
  • Fire protection: Assessment of the fire protection system used in the data center, focusing on its type and adequacy.
  • Communications: Assessment of the communications systems focusing on inspection of diverse fiber entrances, number of feeds, cable routes and their diversity, number of demarcation points (demarcs), carriers, diverse FDPs within the facility, strain relief, use of innerduct, etc.
  • Security: Assessment of the security systems focusing on evaluating access controls, camera locations, monitoring system type and monitored points, and visitor escort procedures.
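
The single-point-of-failure check described under critical systems can be illustrated with a small sketch. It assumes the power distribution has been written down as a directed graph of components, and it treats a component as a single point of failure for a load if removing it cuts the load off from every power source. The component names (utility_a, ats, ups_a, pdu_1, rack_12) and the function names are hypothetical, chosen only for illustration; a real assessment would also cover cooling paths and rely on on-site inspection.

```python
from collections import deque

def reachable(graph, sources, target, removed=None):
    """Return True if `target` is fed by any source when `removed` is out of service."""
    queue = deque(s for s in sources if s != removed)
    seen = set(queue)
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in graph.get(node, ()):
            if nxt != removed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

def single_points_of_failure(graph, sources, load):
    """List every component whose loss alone leaves `load` without power."""
    nodes = set(graph) | {n for outs in graph.values() for n in outs}
    candidates = nodes - {load}
    return sorted(
        n for n in candidates
        if not reachable(graph, sources, load, removed=n)
    )

# Hypothetical one-line diagram: two sources, one transfer switch,
# two UPS modules, one PDU feeding one rack.
POWER_PATHS = {
    "utility_a": ["ats"],
    "generator": ["ats"],
    "ats": ["ups_a", "ups_b"],
    "ups_a": ["pdu_1"],
    "ups_b": ["pdu_1"],
    "pdu_1": ["rack_12"],
}
SOURCES = ["utility_a", "generator"]

spofs = single_points_of_failure(POWER_PATHS, SOURCES, "rack_12")
print(spofs)  # ['ats', 'pdu_1']
```

In this made-up layout the redundant UPS modules and the two sources are not flagged, while the shared transfer switch and the single PDU are. Those are the kinds of components for which the assessment would propose a redundant path.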

Steps in Data Center Risk Assessment

The first step in the risk assessment is to identify and evaluate risks. This evaluation covers the site, the critical infrastructure and support infrastructure such as fire protection and security. It examines existing equipment for age, serviceability and capacity, identifies single points of failure, determines the probability of occurrence of each risk and recommends corrective actions to mitigate it.
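
One common way to turn probability of occurrence into a priority order is a simple risk register that scores each risk as probability times impact. The sketch below assumes a 1 to 5 scale for both values; the impact rating, the scales and the sample entries are assumptions made for illustration and are not taken from any actual assessment.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Risk:
    name: str
    probability: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int       # assumed scale: 1 (minor) to 5 (severe outage)
    corrective_action: str

    @property
    def score(self) -> int:
        # Simple probability x impact score; other weightings are possible.
        return self.probability * self.impact

def prioritize(risks):
    """Order risks from highest to lowest score, breaking ties by name."""
    return sorted(risks, key=lambda r: (-r.score, r.name))

# Hypothetical register entries for illustration only.
REGISTER = [
    Risk("Single utility feed", 2, 5, "Add a second feed or on-site generation"),
    Risk("Aging cooling units", 4, 4, "Plan phased replacement"),
    Risk("No diverse fiber entrance", 2, 4, "Add a second entrance and carrier"),
    Risk("Site in flood plain", 1, 5, "Raise critical equipment, add pumps"),
]

for risk in prioritize(REGISTER):
    print(f"{risk.score:>2}  {risk.name}: {risk.corrective_action}")
```

Sorting by score gives a first-pass ranking only. In practice the team would review the order against budget, planned shutdowns and organizational goals before committing to a mitigation plan.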

The next step is to develop a plan for mitigating the risks. This includes developing policies and procedures for infrastructure upgrades and improvements, planning for shutdowns or providing backup infrastructure to maintain uptime, and developing an emergency action plan and schedule that align with organizational goals. Once the plan has been reviewed and approved by the appropriate personnel, it is executed and the upgrades and improvements are implemented accordingly. The results are reviewed on a regular basis, and policies and procedures are refined as needed to keep them aligned with organizational needs.

As discussed previously, the purpose of the risk assessment is to assess the overall site and its reliability, covering the critical electrical and mechanical infrastructure, fire protection, communications and security. The assessment identifies single points of failure in the critical electrical and mechanical infrastructure and provides recommendations and estimated costs to mitigate the risks. The assessment report gives a clear picture of the capabilities of the existing data center site and can be used as a performance improvement plan to optimize future capital and operational expenditures.