Maintaining Data Center Operations During the COVID-19 Pandemic
Shifting from surviving to thriving
You don’t have to look far to see how IT has been fundamentally upended by the global COVID-19 pandemic. As organizations in all sectors have rapidly emptied their offices and sent their employees home to comply with ever more expansive shelter-in-place and quarantine mandates, replicating the full breadth of services remotely has been IT’s singular priority.
All of this is nothing short of a remote collaboration revolution, and it is already rewriting how work gets done — and how technology gets supported — when direct access to traditional, physical infrastructure is no longer a given.
But this is merely one aspect of IT, and as we begin to digest how these changes will shape technology best practices, both during the current crisis and well into the future, we can’t afford to ignore the often unseen underpinnings of IT infrastructure that don’t have the luxury of working remotely.
Not an Option
Simply put, mission-critical facilities like data centers can’t be relocated into employees’ home offices. While transferring end-user productivity out of a traditional office context is a fairly straightforward process, the same can’t be said for the highly specialized workloads that can only be managed within the context of a data center. Beyond the obvious and nontransferable capabilities of the facilities themselves (grid access, raw compute power, failover, security, etc.), there is the very real accountability associated with the sheer volume and type of workloads managed within them.
Regulatory constraints around how incalculably vital data must be managed and protected throughout all phases of its lifecycle add even more complexity to data center protocols during a pandemic.
So while you can’t simply abandon your data center in the same manner as your end users have cleared out their offices, you can — and must — understand how to rebalance your provision of data center services in light of how the pandemic continues to evolve. And you must do so while you continue to keep the lights on for stakeholders who need uninterrupted access to data center services now more than ever.
Against this backdrop, if you haven’t already examined your data center management strategy through a COVID-19 lens, now is the time to do so. As with anything related to the data center, however, this will be a complex, multifaceted process, and you should position yourself to navigate it by looking at it through the following contexts.
Capacity Management — The historically unpredictable global business environment is putting unprecedented pressure on capacity management, with businesses barely able to forecast demand — or, in many cases, keep up with it. Global internet traffic is trending upward, with a number of exchanges routinely reaching record throughput as entire economies and workforces adjust to the new lockdown paradigm. Some organizations facing spiking demand have no choice but to move services out of their own data centers and lean more heavily on vendors. This makes absolute sense in an unpredictable landscape where scale needs to be implemented without delay, but it doesn’t make everyday issues like bandwidth, power, CPU, memory, and disk space disappear. Rather, it shifts the burden onto these external providers and their specific infrastructure. IT leadership must adapt these partnerships to keep pace because, if vendors don’t stay ahead of the curve, IT may find itself unable to adequately serve the business.
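One way to keep those everyday constraints visible, whether the resources sit in your own facility or a vendor's, is a simple headroom check. The following is a minimal sketch, not a prescribed method: the `Resource` type, the sample figures, and the 20 percent threshold are all illustrative assumptions rather than values from any particular provider.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """A hypothetical capacity record; units are whatever the resource uses."""
    name: str
    capacity: float    # total provisioned amount (e.g. Gbps, kW, TB)
    peak_usage: float  # highest observed usage, in the same units


def headroom_ratio(resource: Resource) -> float:
    """Return the fraction of capacity still unused at peak load."""
    if resource.capacity <= 0:
        raise ValueError(f"{resource.name}: capacity must be positive")
    return 1.0 - resource.peak_usage / resource.capacity


def flag_constrained(resources, min_headroom=0.2):
    """List resources whose spare capacity falls below the assumed threshold."""
    return [r.name for r in resources if headroom_ratio(r) < min_headroom]


# Illustrative numbers only.
resources = [
    Resource("bandwidth_gbps", capacity=100.0, peak_usage=90.0),
    Resource("power_kw", capacity=500.0, peak_usage=350.0),
    Resource("disk_tb", capacity=800.0, peak_usage=700.0),
]

print(flag_constrained(resources))
```

Running the same check against figures reported by each vendor is one way to confirm that shifting a workload has not simply moved the bottleneck somewhere less visible.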
Connectivity — The old advice against putting all your eggs in one basket has never been more valid than it is now. This issue relates directly to capacity management, and, as the crisis deepens, the strain on all aspects of infrastructure will only increase. Diversify your upstream providers as much as possible to mitigate the risks associated with any one of them being compromised by pandemic-related resourcing constraints. This minimizes the potential for back-end interruptions to reach your customers. Leverage third-party user reviews and analyst resources to better assess and compare vendors, match provider capabilities to fast-changing business needs, and position yourself to make best-of-breed decisions faster.
Disaster Recovery — The growing deployment of mission-critical services off-premises doesn’t only impact day-to-day service delivery and the service level agreements (SLAs) that set expectations and confirm accountabilities. It also has significant implications for disaster recovery (DR) planning and implementation because it shifts a fair degree of risk over to the third-party providers now responsible for delivering these services. DR plans must be updated to reflect this new world of vendor-distributed work, and vendors must be integral to this process to ensure they are in position to fulfill all requirements.
Security — Cybercriminals have never missed an opportunity to take advantage of periods of uncertainty to ply their nefarious trade, and the COVID-19 pandemic is no exception. As more organizations move their services to centralized locations, bad actors suddenly have significantly more, and better defined, higher-value targets. From a cybercriminal’s perspective, why attack one company and net only one victim when you can attack a mission-critical data center and compromise many victims? This sobering reality reinforces the need to nail down end-to-end security protocols with all vendors, including, but not limited to, encryption, authentication, and on-site access control. Reaffirming your cybersecurity skills inventory, and closing any gaps with targeted training, should also be prioritized.
Colocation — If you are either using or responsible for colocated resources or infrastructure, you must take immediate steps to reduce physical risks at all levels:
- Focus on disease control and disinfection throughout the facility.
- Enforce monitoring — including temperature checks — at closely controlled entries, and turn away anyone exhibiting symptoms to avoid compromising the facility itself.
- Reduce the number of people on-site, especially unknowns and other individuals not considered essential to the business.
- Consider extending shift lengths from eight to 12 hours and moving to a two-shift schedule, if local labor laws will accommodate.
- Take special steps to protect technical staff with skills required to maintain data center uptime, including sequestering them in a third, unscheduled shift and holding them in reserve in case primary staff exhibit symptoms.
- Incorporate in-person monitoring of tasks during shift rotations to ensure continuity of operations. Implement contactless handovers to minimize transmission risk during these critical periods.
- Assign operations and technical resources to single buildings and prevent them from moving to other buildings within a larger campus.
- Prioritize the implementation of “smart hands” services to ensure tasks requiring on-site engagement are handled by trained, known resources.
- Leverage guidance from local and regional health authorities to ensure nothing is missed, including physical traffic control methods in shared areas to support social distancing.
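The staffing arithmetic behind the two-shift and reserve-shift suggestions above can be sketched in a few lines. This is a simplified illustration that assumes round-the-clock coverage by equally sized crews and ignores rest days, overtime rules, and local labor law; the function names are hypothetical.

```python
def crews_needed(shift_hours: int, hours_per_day: int = 24) -> int:
    """Number of crews required to cover a full day with back-to-back shifts."""
    if shift_hours <= 0 or hours_per_day % shift_hours != 0:
        raise ValueError("shift length must divide the day evenly")
    return hours_per_day // shift_hours


def reserve_crews(total_crews: int, shift_hours: int) -> int:
    """Crews left over to hold in reserve once daily coverage is met."""
    active = crews_needed(shift_hours)
    if total_crews < active:
        raise ValueError("not enough crews to cover the day")
    return total_crews - active


# Three crews on 8-hour shifts leave no one in reserve...
print(reserve_crews(total_crews=3, shift_hours=8))
# ...while the same three crews on 12-hour shifts free one crew to sequester.
print(reserve_crews(total_crews=3, shift_hours=12))
```

Under these assumptions, moving from three 8-hour shifts to two 12-hour shifts is what makes a sequestered reserve crew possible without adding headcount.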
Focus on the Opportunity
Not everything about the current pandemic should incite fear — all significant disruptions offer opportunities to rethink how data center operations are planned, managed, and evolved over time. The opportunities can be game changing, but only if you take the time to get out of firefighting mode and zero in on what your strategy should look like once COVID-19 is firmly behind us.
For example, as more data moves offsite into data centers, GPUs can be leveraged for compute-intensive artificial intelligence, machine learning, and related data analysis applications. Recognize that data has gravity and tends to pull surrounding apps with it. Position yourself to sell compute capacity to meet these shifting demands.
Don’t Reinvent the Wheel
As the pandemic continues to play out, expect the value of traditional data center best practices to be reinforced. This isn’t so much a time to rip apart and rebuild as it is to validate what you’ve been doing all along and double down on it.
Start by ensuring your basics are sound and that your existing slate of products and services is solid, secure, and well-communicated to your stakeholders. The sudden increase in demand for data center services and capacity may be unique in history, but stakeholders will depend on you having a firm foundation. By taking the time to reaffirm that this is indeed the case, you’re in a much better position to scale and meet this demand.
Learn from Experience
As unique as this experience seems to us all, recognize that we’ve been through this before — including the SARS, H1N1, and Ebola outbreaks in 2003, 2009, and 2014, respectively. Refer back to any documentation you may have from those periods to inform your thinking and responses for the current pandemic, but bear in mind that the impact in those previous cases was significantly smaller, and we “returned to normal” much more quickly.
This time out, the impact is unprecedented, and the timeline is unlikely to resolve itself any time soon. Expect the return to anything remotely approaching “normal” to take far longer than originally anticipated, and, even then, expect the very definition of the word to evolve.
Many of the economic, technological, and social changes will indeed be permanent, which means your go-forward strategy to manage data center resources should not be to overutilize what you’ve got and hope to ride out the storm. Rather, now is the time to scale your investments in critical infrastructure and prepare for a changed world thereafter — a strategy that will maximize your business continuity and minimize the risks associated with navigating these uncharted times.