The critical facilities industry uses service level agreements, or SLAs, extensively. SLAs vary greatly and are used internally within corporations, between landlords and tenants (especially within colocation, or “colo,” facilities), and between owners and outside service providers. When written and implemented correctly, SLAs are excellent tools for establishing performance standards and measurements to keep critical operations on course. When written poorly, they can waste valuable resources. And if not implemented properly, they can result in a false sense of security and ugly outcomes.

Typically, SLAs define a level of performance without specifying how the performance is met. For example, IT and facilities management (FM) can agree that the data center will maintain server inlet conditions within the ASHRAE Thermal Guidelines at all times. IT doesn’t dictate how FM accomplishes this, but the SLA should describe how compliance is measured and verified. FM might also establish an SLA with a provider of rental, roll-up chillers. That SLA does not specify where the supplier obtains the chiller or how it is transported, just that it arrives on-site and on time.
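As a minimal sketch of what "measured and verified" could look like for the inlet-temperature example, the Python below computes the share of sensor readings that fall inside an agreed range. The `InletReading` type, the sensor names, the sample values, and the 18 to 27 °C limits are illustrative assumptions, not terms from any actual agreement; the real limits would come from the SLA itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InletReading:
    """One server inlet temperature sample (hypothetical structure)."""
    sensor_id: str
    temp_c: float


def compliance_fraction(readings, low_c, high_c):
    """Return the fraction of readings inside the inclusive range [low_c, high_c]."""
    if not readings:
        raise ValueError("no readings to evaluate")
    in_range = sum(1 for r in readings if low_c <= r.temp_c <= high_c)
    return in_range / len(readings)


# Illustrative data only; the limits are assumed placeholders for the SLA's own values.
readings = [
    InletReading("rack-01", 22.5),
    InletReading("rack-02", 26.0),
    InletReading("rack-03", 28.4),  # above the assumed upper limit
    InletReading("rack-04", 19.0),
]
fraction = compliance_fraction(readings, low_c=18.0, high_c=27.0)
print(f"{fraction:.0%} of readings within limits")
```

An "at all times" commitment would require this fraction to be 100 percent, which is one reason the agreement needs to say which sensors count and how often they are sampled.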

A well-written SLA should be structured and detailed and include the following:

  • Clear description of the service being delivered
  • Clear demarcation of the responsibilities of each party (customer and provider)
  • Service level objectives (SLOs) that describe how well the service will be delivered
  • Quality of service measures (QoMs) that define how the provided service gets measured
  • Billing details, including penalties and rewards (if any)
  • Performance period
  • Performance review cycle
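
One way to keep a draft honest against the list above is to treat each element as a required field and flag whatever is still empty. This is only a sketch: the `SlaDraft` class, its field names, and the sample text are hypothetical and simply mirror the seven items.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class SlaDraft:
    """Hypothetical checklist mirroring the seven elements of a well-written SLA."""
    service_description: Optional[str] = None
    responsibilities: Optional[str] = None
    service_level_objectives: Optional[str] = None
    quality_measures: Optional[str] = None
    billing_details: Optional[str] = None
    performance_period: Optional[str] = None
    review_cycle: Optional[str] = None


def missing_sections(draft):
    """Return the names of sections that are still empty, in checklist order."""
    return [f.name for f in fields(draft) if not getattr(draft, f.name)]


# Illustrative draft with only two sections written so far.
draft = SlaDraft(
    service_description="Rental chiller, capacity per site specification",
    billing_details="Monthly availability fee plus call-out charges",
)
print(missing_sections(draft))
```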

Using a rental chiller scenario as an example, here are some details to include in an SLA.

Clear description of service to be provided. The SLA needs to describe the exact capacity and capability of the required chiller. It must state how the chiller will be set up, what controls are required, whether it needs its own chilled water pump or will connect to the facility’s chilled water pumps, and what power connections are required or whether a generator must be included.

Clear demarcation of each party’s responsibilities. After the chiller arrives on site, then what? Who is responsible for making the connections (power, piping, and possibly monitoring and controls) to the existing infrastructure? Who will start up and operate the chiller? If a generator is included, who is responsible for monitoring the fuel oil level and refilling? Who provides the required pipe/hose, power cables, and associated connection fittings?

Service level objectives that describe how well the service will be delivered. Assuming a four-hour response time, when does the clock start and when does it stop? It probably starts when the provider answers the phone call and initial request. But when does the clock stop? When the chiller arrives on site? When the chiller is parked? When the chiller is connected to the chilled water taps? When the chiller is started? What if the chiller runs for five minutes and then trips on a safety due to improper settings?
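
The questions above matter because the same delivery can pass or fail depending on which milestone stops the clock. The sketch below makes that concrete. The milestone names, timestamps, and function names are assumptions chosen for illustration; an actual SLA would define its own start and stop events.

```python
from datetime import datetime, timedelta


def response_time(events, start="request_acknowledged", stop="started"):
    """Elapsed time between two named milestones in an event log."""
    for name in (start, stop):
        if name not in events:
            raise KeyError(f"milestone not recorded: {name}")
    return events[stop] - events[start]


def meets_sla(events, limit, start="request_acknowledged", stop="started"):
    """True if the elapsed time from start to stop is within the agreed limit."""
    return response_time(events, start, stop) <= limit


# Hypothetical event log for one emergency call-out.
events = {
    "request_acknowledged": datetime(2024, 7, 1, 8, 0),
    "arrived_on_site": datetime(2024, 7, 1, 11, 30),
    "connected": datetime(2024, 7, 1, 12, 45),
    "started": datetime(2024, 7, 1, 13, 10),
}
four_hours = timedelta(hours=4)

# Same delivery, two different verdicts depending on the stop event.
print(meets_sla(events, four_hours, stop="arrived_on_site"))  # True (3 h 30 min)
print(meets_sla(events, four_hours, stop="started"))          # False (5 h 10 min)
```

A stricter definition might add a further milestone, such as a period of stable operation, so that a chiller that trips after five minutes does not count as delivered.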

The chiller capacity and performance should also be defined clearly. If the chiller operates and meets the load demand but at an elevated supply water temperature and higher than expected differential temperature, does it meet the SLA requirements or not? What if the chiller supplied is oversized and cycles on/off due to low load conditions and the resulting thermal transients cause occasional elevated chilled water temperatures? What if the chiller’s weight exceeds the load limit for the pavement or location where it was to be placed? Who is responsible for any subsequent damages?

QoMs that define how the provided service gets measured. QoMs are especially appropriate for high-volume services. For example, an FM SLA might measure the share of maintenance tickets closed on schedule (90 percent or more), the time to complete branch-circuit installs (within two days of the IT request), or whether routine maintenance tasks are completed within the approved maintenance window. So what about our rental chiller scenario? Some obvious QoMs are the response times for delivery, connection, and startup of the chiller. The chiller should also provide stable, continuous operation, but for how long?
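
For a high-volume measure such as on-schedule ticket closure, the calculation is simple but the counting rules need to be agreed. The sketch below assumes each ticket is a `(due, closed)` pair of dates, that a ticket closed on its due date counts as on time, and that a ticket still open counts as missed; these rules, the function name, and the sample data are illustrative assumptions.

```python
from datetime import date


def on_time_rate(tickets):
    """Fraction of tickets closed on or before their due date.

    Each ticket is a (due, closed) pair; closed is None while the ticket is open.
    """
    if not tickets:
        raise ValueError("no tickets in the review period")
    on_time = sum(
        1 for due, closed in tickets if closed is not None and closed <= due
    )
    return on_time / len(tickets)


# Hypothetical review period with five tickets, one of them closed late.
tickets = [
    (date(2024, 3, 1), date(2024, 3, 1)),
    (date(2024, 3, 4), date(2024, 3, 3)),
    (date(2024, 3, 5), date(2024, 3, 7)),  # closed two days late
    (date(2024, 3, 8), date(2024, 3, 8)),
    (date(2024, 3, 11), date(2024, 3, 10)),
]
TARGET = 0.90  # the 90 percent threshold used in the example above

rate = on_time_rate(tickets)
print(f"on-time rate {rate:.0%}, target met: {rate >= TARGET}")
```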

QoMs should also define how the customer can enforce the agreement, including appropriate actions to mitigate failed performance. If the requested chiller fails to arrive, can the customer call another provider and back-charge the costs? If the customer’s monitoring sensors differ from those local to the chiller, whose data become the official values for enforcing the agreement? Can the customer simulate a failure and run a drill to verify the provider’s readiness?

Billing details including penalties and rewards (if any). The SLA needs to include specific details on how services will be paid for and how performance issues affect compensation. Compensation should be structured to account for all possible outcomes. There could be a monthly charge that guarantees chiller availability and a supplemental charge for a scheduled chiller call-out (e.g., when a permanent chiller is taken out of service for routine maintenance). There might be a separate supplemental charge for an emergency call-out.

The agreement on compensation should also include language on how failures to perform get addressed. What happens if an emergency response call is made and there are no suitable chillers available? What if the chiller arrives late? What if it arrives on time but has operational problems? Depending upon the criticality of the need, the SLA could include financial penalties for failure to perform. These would escalate depending upon the severity of the performance issue.
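
An escalating penalty can be written as a tier table so that both parties can compute the same number from the same facts. In the sketch below, the tier thresholds, the credit percentages, the call-out charge, and the function names are all invented for illustration; the real figures would be negotiated and written into the agreement.

```python
def penalty_fraction(hours_late, tiers):
    """Return the credit fraction for the highest tier reached.

    tiers is a list of (minimum_hours_late, fraction) pairs.
    """
    fraction = 0.0
    for threshold, tier_fraction in sorted(tiers):
        if hours_late >= threshold:
            fraction = tier_fraction
    return fraction


def credit_due(call_out_charge, hours_late, tiers):
    """Amount credited back to the customer for a late delivery."""
    return call_out_charge * penalty_fraction(hours_late, tiers)


# Hypothetical tiers: the later the chiller, the larger the credit.
TIERS = [(0.5, 0.10), (2.0, 0.25), (6.0, 0.50)]

for late in (0.0, 1.0, 3.0, 8.0):
    print(late, credit_due(1000.0, late, TIERS))
```

A table like this covers lateness only. Cases such as no suitable chiller being available, or a chiller that arrives on time but does not run properly, would need their own terms.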

Performance period. The performance period, or length of the SLA contract, should be specified. As with many service contracts, the term is typically a year or more, with language on how either party can break the contract and on options to renew.

Performance review cycle. A very important aspect of SLAs is that they include periodic review sessions where the two parties meet to discuss the status of the agreement, the past performance history, and any lessons that can be learned to improve the overall relationship.

Carefully written and properly executed SLAs enable mission critical facilities to realize optimal results at the lowest possible cost. Paying close attention to the complex details of any given scenario will go a long way towards achieving an organization’s ultimate goals and objectives.