Most critical facilities demand continuous operations for extended periods, typically measured in years if not decades. To achieve this challenging goal, these facilities rely on robust and redundant infrastructure, sophisticated monitoring and control systems, and competent and disciplined facilities operating staff. Providing continuous operations under normal site conditions over the long term is difficult enough; it becomes even more difficult when you introduce inherently risky scenarios such as major site upgrades, expansions, and renovations, or when equipment and systems must be taken out of service for “major” maintenance and repairs.
One characteristic of many successful critical facility management organizations is the interdependence and integration of the functions and capabilities of the facilities staff with the site building management system (BMS) and other associated monitoring and control systems, including electrical power monitoring systems (EPMS), supervisory control and data acquisition (SCADA) systems, and building automation systems (BAS). Facilities staff consider the BMS one of the most important tools (if not the most important tool) for managing site operations and maintenance activities. Likewise, the BMS can report indirectly on the overall performance of the facilities staff. When integrated properly, the BMS and the facilities staff back up, supplement, and oversee each other.
One of the most important functions of the facilities staff is to monitor and verify that the BMS and related control systems perform correctly and to initiate manual intervention when control systems fail. During planned activities such as isolating systems or equipment for routine maintenance, a best practice is to stage facilities staff in critical locations where they can observe the automated system actions and be ready to take manual action if required. If (and some may say when) the expected sequence of operations (SOO) fails to occur, the site staff take manual control of the infrastructure and configure the systems into a “safe” operating condition.
BMS platforms and their site-specific deployments vary greatly from one facility to the next. Some are designed for basic monitoring and control, with only the most critical parameters being monitored or controlled, such as equipment on/off status and “summary” alarms. At the other extreme, a BMS can acquire so much data from connected equipment and sensors that operating staff suffer from “information overload.” The best practice is not one extreme or the other, but a well-thought-out strategy that matches the deployed system to the resources and capabilities of the site staff.
For example, consider what level of detail the staff requires to manage the operations and maintenance of critical infrastructure. For a site with round-the-clock on-site facilities staff, monitoring the on/off status and general summary alarm of most equipment may be adequate. If an anomaly occurs, the BMS alerts the on-site staff, who can immediately respond to the equipment, determine locally whether the anomaly is real and what its actual cause and impact are, and take appropriate actions to mitigate or resolve the issue.
This same approach is not recommended when the site is not staffed continuously. In this case, remote staff must mobilize and travel to the site to investigate each alarm. A site without continuous facilities staffing must also rely more heavily on full automation of redundant infrastructure than a site with on-site staff who can reconfigure systems manually. Consider a site where normal utility power is interrupted long enough to transfer the building to emergency generators. Many sites with continuous staff coverage will not allow the facility to return to utility power automatically, and instead choose to initiate the transfer manually, only after the site has had time to prepare and critical staff are positioned to manage and observe the transfer.
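The retransfer policy described above can be expressed as a simple permissive check. The following is a minimal sketch only, not a representation of any vendor's transfer controls: the `SiteState` fields, the function name, and the 30-minute utility stability threshold are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class SiteState:
    # Hypothetical inputs; a real site would define its own permissives.
    utility_stable_minutes: float    # time since utility power returned and stayed healthy
    staffed: bool                    # True if facilities staff are on site
    staff_in_position: bool = False  # staff staged at critical locations
    operator_approved: bool = False  # explicit manual "go" from the operator


def may_retransfer_to_utility(state, min_stable_minutes=30.0):
    """Return True if the site may transfer from generators back to utility."""
    # Never return to a utility source that has not proven stable (assumed threshold).
    if state.utility_stable_minutes < min_stable_minutes:
        return False
    # An unstaffed site has to rely on automation for the return transfer.
    if not state.staffed:
        return True
    # A staffed site holds on generators until people are ready and approve it.
    return state.staff_in_position and state.operator_approved
```

With this logic, a staffed site stays on generator power until both manual conditions are met, while an unstaffed site returns to utility as soon as the stability timer expires.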
Most industrial and commercial grade monitoring and control systems are extremely reliable and provide years of consistent operation. Regardless, these systems require periodic maintenance, including sensor calibrations, software refreshes, and “system admin” type upgrades, especially when new equipment is installed, operating parameters are modified, or the programmed SOO must change. When you include the various actuator hardware (valve/damper actuators, breakers/switches, starter contacts, etc.), the potential for control failures increases. When the BMS needs maintenance or repairs, the site must rely on the facilities staff to monitor and control the site infrastructure.
The capabilities of most BMS installations are also limited. Most sites rely on facilities staff to perform routine “rounds” inspections of critical equipment spaces, where the human senses supplement the BMS. Staff with even modest experience at a given site can detect changes in background noise (bearing squeal, loose belts, local annunciators, etc.), changes in room odors (burning smell, etc.), changes sensed by touch (vibrations, heat, etc.), and visual changes (burned-out lamps, broken damper blades, leaks, etc.).
Likewise, the BMS can provide some oversight of the overall performance of the site staff. An obvious example is the BMS raising an alarm when site staff initiate an erroneous action, such as operating the wrong breaker or valve. The BMS can also provide data to support key performance indicators (KPIs) such as system and equipment availability, anomaly response times and effectiveness, and energy consumption and efficiency. Many sites have programmed pre-formatted reports that track how many BMS alarms exist, how much time elapses between alarm acknowledgement and resolution, and which anomalies repeat, indicating failure-prone equipment or possibly poor operating or maintenance practices.
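As an illustration of the kind of report described above, the sketch below computes three figures from a list of alarm records: the number of unresolved alarms, the mean time between acknowledgement and resolution, and the points that alarm repeatedly. The `Alarm` record layout, the `alarm_kpis` function, and the repeat threshold of two are assumptions made for this example; a real BMS export will have its own schema.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Alarm:
    # Hypothetical record layout; a real BMS export will differ.
    point: str                               # equipment or point name
    raised: datetime                         # when the alarm was raised
    acknowledged: Optional[datetime] = None  # when staff acknowledged it
    resolved: Optional[datetime] = None      # when the condition cleared


def alarm_kpis(alarms, repeat_threshold=2):
    """Summarize open alarms, ack-to-resolution time, and repeat offenders."""
    # Alarms with no resolution timestamp are still open.
    open_count = sum(1 for a in alarms if a.resolved is None)

    # Only alarms that were both acknowledged and resolved have a duration.
    durations = [
        a.resolved - a.acknowledged
        for a in alarms
        if a.acknowledged is not None and a.resolved is not None
    ]
    mean_ack_to_resolve = (
        sum(durations, timedelta()) / len(durations) if durations else None
    )

    # Points that alarm repeatedly may indicate failure-prone equipment.
    counts = Counter(a.point for a in alarms)
    repeat_points = {p: n for p, n in counts.items() if n >= repeat_threshold}

    return {
        "open_count": open_count,
        "mean_ack_to_resolve": mean_ack_to_resolve,
        "repeat_points": repeat_points,
    }
```

Whether a repeat count points to equipment or to operating practice is a judgment the figures alone cannot make; the report only flags where to look.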
It is still widely reported and generally accepted that most critical facility outages and mission impacts are the result, either directly or indirectly, of human activity. In some instances, the failure is a result of human error; in others, it is a combination of anomalies and unusual circumstances that align in what is referred to as the “perfect storm.” In these situations, the BMS can be a formidable forensic tool: it captures the initial conditions and the actions taken (by the infrastructure itself, by the BMS, and by the site staff), and it can report back the sequence of events along with timestamps and user and device log-ins.
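The sequence-of-events reconstruction mentioned above amounts to merging records from several sources into a single time-ordered list. The sketch below shows the idea under stated assumptions: the `Event` fields and the `build_timeline` function are hypothetical, and the approach only gives a trustworthy ordering if the clocks of the contributing devices are synchronized.

```python
from datetime import datetime
from itertools import chain
from typing import NamedTuple


class Event(NamedTuple):
    # Hypothetical record layout for illustration only.
    time: datetime  # timestamp recorded by the originating device
    source: str     # "infrastructure", "BMS", or "staff"
    actor: str      # device tag or user log-in
    action: str     # what happened


def build_timeline(*logs):
    """Merge any number of event logs into one list ordered by timestamp."""
    # sorted() is stable, so events with identical timestamps keep the
    # order in which their logs were passed in.
    return sorted(chain.from_iterable(logs), key=lambda e: e.time)


def format_timeline(timeline):
    """Render the merged timeline as plain text lines for a report."""
    return [
        f"{e.time.isoformat()} [{e.source}] {e.actor}: {e.action}"
        for e in timeline
    ]
```

Passing the infrastructure, BMS, and staff logs to `build_timeline` yields one list that an investigator can read top to bottom.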
In summary, critical facilities should consider and strategize how best to deploy and manage their facilities staff in conjunction with the site monitoring and control systems. Both are indispensable in meeting the challenge of providing continuous operations. In some respects, the staff supplement the capabilities of the BMS and vice versa; in others, each provides oversight and performance measurement of the other’s effectiveness, including intervening when human or BMS actions fail.