There are industries where typical sites have accurate and comprehensive documentation. Some obvious examples are commercial nuclear sites, airline operations and maintenance facilities, pharmaceutical plants, etc. In each case, the site must have a formal document control and retention program. Surprisingly, many critical facilities and data centers approach document management in a less structured, informal manner. The result, in many cases, is that on-site staff struggle to retrieve requested documents; and even then, they are not totally confident that the information is accurate and up-to-date.

It is widely recognized that most data center outages (and/or impact events) are the result of “human error.” What is not so well understood is the root-cause for that human error. Human error can include following incorrect procedures, errors associated with poor or incorrect labeling, or actions taken by outside vendors or contractors based on inaccurate as-built documentation. In other words, competent staff acting responsibly can make human errors due to inaccurate or out-of-date documentation.

All sites have site documents — and most have lots and lots of documents. The key to success is to have a process that maintains critical site documentation that is up-to-date, organized, and readily accessible to authorized personnel. Equally important is the elimination of out-of-date and inaccurate documents from “as-built” status, to be either destroyed or archived as appropriate. This requires a formal document control and retention program supported by corporate policies, program management and oversight, procedures, resources, and trained staff. The effort required to design and implement these programs far exceeds the effort to maintain these programs over time, especially when trying to establish a new program for an existing site.


A formal document management program needs to answer the basic questions of who, what, where, when, and how. Who will be responsible for administering the program and who will have access to each document? What documents need to be controlled? Where will the documents be located (both hard copy and soft copy)? When do documents have to be created or received, when should they be reviewed or updated, and when do they get destroyed or archived? And how will they be organized, categorized, and kept updated?

The issue of “who” is somewhat dependent on the size and complexity of the site, how the site is organized, and whether the site staff is direct employees or outsourced. Regardless, there should be one person who has ultimate responsibility and accountability for managing the document control program. This would typically be the senior person with overall responsibility for the facilities management department.

The question of who gets assigned the actual day-to-day administration of the document control program depends upon the size of the site, quantity of documents to be controlled, and available staff. For large sites with ample staff, these duties may reside with a designated administrator who also handles other administrative tasks, such as opening and closing workorders and PMs. For smaller sites with fewer staff, these duties may be assigned to the chief engineer, assistant chief engineer, or even spread across multiple staff. For sites that predominantly outsource facility management, one applicable strategy is to include a formal document control process in the contracted services. One additional “who” to consider is: who will audit the processes to ensure consistent compliance? This is, of course, even more important when the care of critical site documents is outsourced.


A much bigger question is: which documents should be included? There are some obvious candidates, such as record drawings and specifications, standard operating procedures, and “compliance” documents. Other documents that should be controlled could include warranties, SLAs (service level agreements), MOPs (methods of procedure), software databases (CMMS, BMS, EPMS, DCIM, etc.), training records, commissioning reports, system test reports, etc.

It is extremely important to balance the criticality of the document/information, the quantity of documents, and the overall effort required. Documents that are unlikely to require revision or change are cataloged and filed once, requiring minimal effort. Documents that change often or are generated in large quantities (such as workorders) will require more effort and diligence. In an age of “information overload,” the question of what to formally control is important. This doesn’t exclude informal document filing and storage. It means that program management and administrators should carefully consider the criticality, use, and volume of documents, thereby ensuring that the program produces the desired results without overtasking available resources.

The question of where to store critical documents is also important. Most companies and organizations are moving toward electronic storage for documents. Many document management software applications are now available that are designed specifically to provide the framework and structure needed to support formal document management. These programs allow for hierarchical access roles including system administrator (with full rights), document owners (with add/edit/delete rights), authorized users (with checkout and print rights), and limited access users (with read-only rights).
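The role hierarchy described above can be sketched as a simple mapping from roles to rights. This is an illustrative sketch only, not the model of any particular document management product: the role keys, the `Right` names, and the assumption that every role above read-only also includes read access are all hypothetical.

```python
from enum import Enum


class Right(Enum):
    READ = "read"
    CHECKOUT = "checkout"
    PRINT = "print"
    ADD = "add"
    EDIT = "edit"
    DELETE = "delete"


# Assumption: each role also inherits READ; the article lists only the
# distinguishing rights for each role.
_USER = frozenset({Right.READ, Right.CHECKOUT, Right.PRINT})
_OWNER = _USER | {Right.ADD, Right.EDIT, Right.DELETE}

ROLE_RIGHTS = {
    "limited_access_user": frozenset({Right.READ}),
    "authorized_user": _USER,
    "document_owner": _OWNER,
    "system_administrator": frozenset(Right),  # full rights
}


def is_allowed(role: str, right: Right) -> bool:
    """Return True if the role grants the right; unknown roles get nothing."""
    return right in ROLE_RIGHTS.get(role, frozenset())
```

Denying everything to an unrecognized role is a deliberate default here, so that a misconfigured account cannot edit or delete controlled documents.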

In most cases, there is also a need for hard copies of at least some documents, such as compliance certifications (elevator inspections, backflow preventer and safety relief valve tests, training materials, etc.), operating procedures, MOPs, and other documents that need to be used in the field. One important consideration that applies to both electronic and hard copy critical documents is that backup copies should be maintained in physically separate locations to protect them from loss due to flood, fire, or server failure. Of course, this means a process must be in place that not only keeps the primary source updated, but also purges obsolete documents from the backup copies and replaces them with the updates as well.
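As a rough illustration of the backup process described above, the sketch below compares a primary document set with a backup copy and lists what needs to be copied, replaced, or purged. The function name and the representation of each set as a mapping from document ID to revision number are assumptions made for this example.

```python
def plan_backup_sync(primary: dict, backup: dict) -> dict:
    """Compare two {document_id: revision} mappings and plan the sync.

    'copy'    -> documents missing from the backup
    'replace' -> documents whose backup revision differs from the primary
    'purge'   -> documents in the backup that are no longer in the primary
    """
    return {
        "copy": sorted(d for d in primary if d not in backup),
        "replace": sorted(
            d for d in primary if d in backup and backup[d] != primary[d]
        ),
        "purge": sorted(d for d in backup if d not in primary),
    }
```

Whether a purged document is destroyed or moved to an archive would depend on the retention rules that apply to it.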

As for when to control documents, the first step is to establish a document management program while the facility is being programmed, that is, before the facility exists. Critical site documents are created even before the facility is constructed, including the owner’s project requirements (OPR) document, basis-of-design (BOD), construction documents (drawings and specifications), etc. During construction, even more documents are created, including submittals, startup/pre-functional tests, acceptance test reports, etc. And as the facility is completed and readied for “day-1” operation, as-built drawings, standard operating procedures, commissioning reports, and training documents should already be in place. Trying to design and implement a document control system after these documents have been provided leads to errors, omissions, and greater effort than if the system already exists as a repository for documents as they are submitted.

The program should also have policies that define a schedule for documents to be submitted. This could include requirements that record drawings and specifications be provided in electronic form within two weeks of completion of a project, and in hard copy within three weeks. Critical operating procedures should be updated and verified prior to acceptance of any new work. Training documents should be reviewed and updated prior to performing any new hire or remedial training. Another important factor to consider is how long documents should be retained. Some documents should be retained for the life of the facility. Others need to be retained for established durations based on regulatory or code requirements. And some documents should be either destroyed or archived when they are reasonably deemed to be no longer of value or unlikely to be needed (such as obsolete training material). And of course there should be a requirement that the overall program be audited on a regular basis, such as annually, supplemented by occasional random spot-checks.
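The example submission schedule above (electronic copies within two weeks of project completion, hard copies within three) can be expressed as a small deadline check. This is a minimal sketch; the function names and the idea of tracking received formats as a set are assumptions for illustration.

```python
from datetime import date, timedelta

# Deadlines taken from the example policy in the text.
SUBMISSION_DEADLINES = {
    "electronic": timedelta(weeks=2),
    "hard_copy": timedelta(weeks=3),
}


def submission_due_dates(project_completed: date) -> dict:
    """Return the due date for each required format."""
    return {
        fmt: project_completed + delta
        for fmt, delta in SUBMISSION_DEADLINES.items()
    }


def overdue_formats(project_completed: date, received: set, today: date) -> list:
    """List formats not yet received whose due date has passed."""
    due = submission_due_dates(project_completed)
    return sorted(
        fmt for fmt, deadline in due.items()
        if fmt not in received and today > deadline
    )
```

A check like this could be run against open projects during the periodic audit to flag record documents that were never turned over.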

The last consideration is “how”: how will documents be controlled? The first order of business is to define a document identification convention that assigns each document a unique ID number, revision number, and issue date. The importance of this cannot be overstated. A well thought-out convention will make future searches simple and intuitive, especially for electronic storage systems. The document control system should include a master document list that acts as a “Table of Contents” for the program. This should include the document ID, title, revision, and location. It should also include the document owner (entity with responsibility to keep the document updated and accurate), retention period, and any affiliation with a code or regulatory requirement. Considering the vast quantity of site documents to be maintained, it is important to categorize and catalog documents in an organized and logical manner.
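To make the idea concrete, here is one way a master document list entry and an ID convention might look. The `SITE-SYSTEM-TYPE-NNNN` pattern (for example `DC1-EL-SOP-0012`), the class names, and the field names are hypothetical choices for this sketch; the fields themselves follow the list in the paragraph above.

```python
import re
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional

# Hypothetical convention: SITE-SYSTEM-TYPE-NNNN, e.g. "DC1-EL-SOP-0012".
ID_PATTERN = re.compile(
    r"(?P<site>[A-Z0-9]{2,4})-(?P<system>[A-Z]{2,4})"
    r"-(?P<doc_type>[A-Z]{2,4})-(?P<seq>\d{4})"
)


def parse_doc_id(doc_id: str) -> dict:
    """Split an ID into its parts, or raise ValueError if malformed."""
    match = ID_PATTERN.fullmatch(doc_id)
    if match is None:
        raise ValueError(f"ID does not follow the convention: {doc_id!r}")
    return match.groupdict()


@dataclass(frozen=True)
class MasterListEntry:
    doc_id: str
    title: str
    revision: int
    issue_date: date
    location: str
    owner: str
    retention: str
    regulatory_ref: Optional[str] = None


class MasterDocumentList:
    """Acts as the 'Table of Contents': one entry per unique document ID."""

    def __init__(self) -> None:
        self._entries = {}

    def add(self, entry: MasterListEntry) -> None:
        parse_doc_id(entry.doc_id)  # reject IDs that break the convention
        if entry.doc_id in self._entries:
            raise ValueError(f"Duplicate document ID: {entry.doc_id}")
        self._entries[entry.doc_id] = entry

    def revise(self, doc_id: str, issue_date: date) -> MasterListEntry:
        """Bump the revision number and record the new issue date."""
        old = self._entries[doc_id]
        new = replace(old, revision=old.revision + 1, issue_date=issue_date)
        self._entries[doc_id] = new
        return new

    def find(self, **parts: str) -> list:
        """Search by ID parts, e.g. find(system='EL', doc_type='SOP')."""
        return [
            e for e in self._entries.values()
            if all(parse_doc_id(e.doc_id)[k] == v for k, v in parts.items())
        ]
```

Encoding the system and document type in the ID is what makes a search such as `find(system="EL")` possible without opening any document, which is the kind of intuitive lookup a well-planned convention is meant to support.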

Many sites today have moved away from “O&M manuals,” delivered in large binders by vendors and contractors that are organized by trades (such as electrical, mechanical, plumbing, controls, fire and life-safety, etc.), and instead require documents be compiled into “Systems Operations & Maintenance Manuals” (or SOMMs) that organize documents by system. SOMMs typically use a consistent organization and format for each system including a narrative describing the system and its purpose, single-line diagram, approved submittals, standard operating procedures, baseline performance data and commissioning reports, parts lists, warranties, vendor contact numbers, etc.

As the discussion above demonstrates, a formal document management program can be complex and difficult to design and implement. The effort becomes even greater when trying to establish these programs in existing facilities where critical documents may have been lost, misplaced, or left unverified. Some sites have embraced bar-coding or building information modeling (BIM) systems, coupled with handheld devices, to make critical documents readily available to field staff at the touch of a button (or screen). The effort required to deploy a new program will greatly exceed the effort to maintain the program over time, but without oversight and discipline, even the best system will eventually fall short. For most facilities (and especially existing facilities) it may make sense to engage an outside contractor to support the initial program design, creation of program policies and conventions, and the selection and deployment of software applications and tools, preferably a contractor who has expertise and experience in document control and retention tools.