I have had discussions with clients, contractors, consultants, and pretty much everyone else at one time or another regarding the benefits of providing both continuity and consistency of services. The two are usually complementary, but they are not the same thing, and occasionally they work against each other. What unites the two concepts is the pursuit of reliable processes that produce expected results, namely continuous operations.
In manufacturing, assembly-line production, and in many construction processes, consistency in execution tends to result in fewer defects and higher quality. The relationship between consistently performing repeatable actions and achieving quality results has been studied and refined to the point of becoming a science. An engineer at Motorola introduced a formal continuous process improvement methodology in 1986 that is known as “Six Sigma.”
Six Sigma was originally focused on reducing the number of defects produced during manufacturing processes, which are inherently repeatable. The name Six Sigma was derived from the overarching goal of producing acceptable results in what statistically amounts to 99.99966% of outcomes.
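For readers curious where the 99.99966% figure comes from, here is a minimal sketch. It assumes the usual Six Sigma convention, which is not spelled out above, that the process mean is allowed to drift by 1.5 standard deviations, so the defect rate is the one-sided normal tail beyond 4.5 sigma. The function name `yield_at_sigma` is my own, chosen for illustration.

```python
import math

def yield_at_sigma(sigma_level: float, shift: float = 1.5) -> float:
    """Fraction of outcomes inside a one-sided limit at (sigma_level - shift) std devs.

    The 1.5-sigma shift is the conventional Six Sigma assumption, not a law of nature.
    """
    z = sigma_level - shift
    # Upper-tail probability of a standard normal variable beyond z.
    defect_fraction = 0.5 * math.erfc(z / math.sqrt(2))
    return 1.0 - defect_fraction

six_sigma_yield = yield_at_sigma(6.0)
defects_per_million = (1.0 - six_sigma_yield) * 1_000_000

print(f"yield: {six_sigma_yield * 100:.5f}%")               # about 99.99966%
print(f"defects per million: {defects_per_million:.1f}")    # about 3.4
```

In other words, the goal works out to roughly 3.4 defects per million opportunities.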
The methodology was modified when applied to “business processes,” which are by their nature less repeatable. Both versions are based on formal iterative methodologies that include a definition of the desired goals, measurement of the current process’s results, and an analysis of how well the results meet the defined goals. Both then develop action plans to improve the current process and establish a feedback loop. Many quality assurance and quality control experts feel the degree of success of Six Sigma programs is greatly influenced by how repeatable the “process” is. This is why Six Sigma is excellent for manufacturing and assembly line work, and is typically less beneficial for business processes and other applications where the desired outcome is something other than identical widgets.
The mission-critical industry measures reliability in terms of percentage of “uptime” over the life of the facility. The most critical facilities strive to achieve as much as 99.9999% uptime (“six nines”), which interestingly is very close to the Six Sigma target of 99.99966%. So, is the formal Six Sigma methodology applicable to the design and operation of critical facilities? My opinion is a qualified yes and no.
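To put those percentages in concrete terms, the rough conversion below turns an uptime percentage into allowable downtime per year. It is a back-of-the-envelope sketch that assumes a 365-day year; the helper name `downtime_seconds_per_year` is hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a 365-day year (no leap days)

def downtime_seconds_per_year(uptime_percent: float) -> float:
    """Seconds of downtime per year implied by a given uptime percentage."""
    return (1.0 - uptime_percent / 100.0) * SECONDS_PER_YEAR

for pct in (99.9999, 99.99966):
    seconds = downtime_seconds_per_year(pct)
    print(f"{pct}% uptime allows about {seconds:.1f} seconds of downtime per year")
```

Under these assumptions, 99.9999% uptime allows roughly half a minute of downtime per year, and 99.99966% allows a little under two minutes.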
Critical facilities, including data centers, are obviously neither manufacturing nor assembly line facilities. That’s not to say that there aren’t aspects of assembly line, repeatable work, such as the workflow processes associated with the receiving, deployment, configuration, and maintenance of IT equipment, or routine operations and maintenance of critical infrastructure that should be scripted through standard operating procedures. But generally speaking there is a broad range of IT devices, equipment, and supporting racks, cabinets, chassis, and configurations that require a diverse and flexible approach to the actual steps needed for successful deployment. Likewise, the actual operations and maintenance of infrastructure require a diverse and flexible approach based on the current operating scenario, maintenance demands, and equipment condition. So trying to directly apply the Six Sigma methodology to these processes would probably expend significant effort for meager success.
On the other hand, there are definitely aspects of the Six Sigma methodology that the critical facility industry can benefit from. First and foremost is to recognize that you can’t leave much to chance if you want to achieve continuous uptime. Executive management needs to champion and formalize quality assurance as an integral part of how a facility is managed, operated, and maintained. Quality control processes, such as “change control,” employee training and certification, employee and vendor “on-boarding,” and work verification and acceptance, need to not only exist but also be followed and enforced in a disciplined manner. Perhaps one of the most valuable quality assurance programs is the implementation of formal, third-party commissioning of all installations and construction, including both new work and modifications to existing computer rooms and supporting infrastructure.
Another aspect is to identify which tasks and activities are truly repeatable and to develop processes and procedures that, when followed, will produce consistent results. These processes should then be formalized and documented to state the overall goal, to establish what constitutes acceptable results, and to provide a means to measure each outcome as either compliant or non-compliant (i.e., a defect). The use of flowcharts to “map” processes can be very useful, especially when identifying what can and should be measured and assessing which steps are prone to human error.
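As a small illustration of measuring outcomes as compliant or non-compliant, the sketch below computes defects per million opportunities (DPMO), the standard Six Sigma defect metric. The function name and the sample numbers are hypothetical and exist only to show the arithmetic.

```python
def dpmo(defects: int, units: int, opportunities_per_unit: int) -> float:
    """Defects per million opportunities for a repeatable, measured process."""
    total_opportunities = units * opportunities_per_unit
    if total_opportunities <= 0:
        raise ValueError("need at least one unit and one opportunity per unit")
    return defects * 1_000_000 / total_opportunities

# Hypothetical example: 200 rack deployments, 25 verified steps each,
# with 3 steps recorded as non-compliant during work acceptance.
rate = dpmo(defects=3, units=200, opportunities_per_unit=25)
print(f"{rate:.0f} defects per million opportunities")
```

Tracking a figure like this over time is one way to see whether a documented procedure is actually producing more consistent results.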
A key to optimizing the quality assurance program is to be judicious in where and how to apply available resources in performing quality control functions. Not all industry “best practices” apply equally to all business models and vice versa. Some large enterprises use very large quantities of identical equipment deployed in identical racks, rows, and “pods” where IT equipment deployment as well as the installation of supporting infrastructure becomes extremely repeatable. The routine fit-out and startup of these deployments can benefit greatly from clearly defined processes backed by continuous process improvement programs similar to Six Sigma. Other sites require more diverse and customized IT deployments to support a variety of business units or other IT-based functions. In these instances quality control should also be somewhat customized, but still incorporate a means to validate that the work performed is compliant with the established workflow process.
When it comes to facilities management, generally there is a benefit to maintaining continuity in service providers by establishing long-term relationships both internally and externally. IT management should develop processes and procedures that include communications with their facilities management counterparts and vice versa. There should be formal processes for defining, overseeing, and verifying the work performed internally as well as by outside vendors and contractors. When these processes are well defined, documented, and enforced, they benefit everyone by minimizing mistakes and errors, which inevitably result in wasted effort, rework, and impacts to budgets and schedules. In cases where employees or contractors are replaced, the new employees or contractors will have a “map” as well as quality controls that ensure consistency in the services provided, which results in improved site reliability.
So when is consistency or continuity not beneficial? Most critical facilities, and data centers in particular, change over time as their missions evolve or as information technology progresses. Most of the time these changes occur slowly over a long period, but occasionally drastic changes occur, such as deploying high or ultra-high density IT equipment, adding hot/cold aisle containment systems, or installing water/liquid cooled electronics. The changes can also be of a bureaucratic nature, such as new regulations, audit requirements, or service-level agreements. Scenarios such as these should prompt a comprehensive review and reassessment of the existing processes, training, and associated quality controls.
As for continuity, when measured performance indicates complacency or service degradation, it is time for a change. The saying “because that’s the way we have always done it” typically indicates a culture where innovation and process improvement are squelched. In fast-moving and constantly evolving facilities, which describes most data centers, this can eventually result in situations where consistency actually reduces reliability.