How DCIM has become critical to business operations
I remember when data centers were seen solely as a utility for the enterprise. They were run as an extension to an existing office building management system where IT equipment racks were treated as another power consumer and heat generator, much like office computers and printers. As such, data center infrastructure used standard HVAC components, systems and controls to maintain the environment to the tight ASHRAE standards in place at that time.
Over time, attitudes about the data center within the enterprise have shifted, and it is now seen as integral to business operations and revenue generation. As a result, data center strategies are more attuned to business realities. Their costs, both capital and operational, are highly scrutinized. This scrutiny comes both from internal checks and balances within a company’s finance office and from external environmental watchdog groups, as greater attention is now paid to energy consumption.
It is in this new economic reality that some are questioning whether it makes sense to continue designing data centers using office management systems and control strategies. With the average cost of a data center outage nearing US$700k [1], improving system availability becomes essential to the bottom line, especially when data centers average two complete outages every two years [2]. When so much of your business depends on the availability of your data services, a better question is when to move to an industrial solution for your data center operations.
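As a rough back-of-the-envelope reading of the two figures above, the sketch below annualizes the outage cost. The function name is hypothetical, the US$700k value is the rounded "nearing" figure rather than an exact one, and the calculation assumes outages are spread evenly over the period.

```python
def expected_annual_outage_cost(cost_per_outage_usd, outages, period_years):
    """Average yearly outage cost, assuming outages are evenly spread."""
    if period_years <= 0:
        raise ValueError("period_years must be positive")
    return cost_per_outage_usd * outages / period_years


# Figures quoted in the text: roughly US$700k per outage,
# two complete outages every two years.
annual_cost = expected_annual_outage_cost(700_000, outages=2, period_years=2)
print(f"Expected annual outage cost: about US${annual_cost:,.0f}")
```

Under these assumptions the average exposure works out to about one full outage cost per year, which is the number any availability investment would be weighed against.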
For example, little attention is paid to the robustness of PLCs and other control hardware because they are assumed to be reliable. But when Amazon’s data center went down a few years back due to a PLC ground fault detection problem [3], many took notice. Even though the PLC in question continued to run, it failed to execute a control strategy, and that failure caused the service disruption. The importance of the control design within the data center facility should not be underestimated, and steps to ensure that the controllers are available to run 7×24 are required given the high cost of downtime.
What’s needed is a well-designed industrial automation solution incorporating numerous capabilities that typically are not supported by small-scale, proprietary systems or by a traditional BMS or PMS. Such an automation system can address both building and electrical equipment under a single user interface, making it easier for operations staff to understand relationships between the equipment and to react correctly to unexpected conditions to prevent outages. Robust controllers with high-speed processors, large amounts of memory and modular I/O are designed for high system availability, with some attaining Technischer Überwachungs-Verein (TÜV) safety inspection class ratings. The added reliability of a TÜV-rated processor, coupled with improved performance, reduces the possibility of running out of capacity when systems need to expand as the data center footprint grows. Improved reliability also reduces maintenance and replacement expenditures.
These controllers use the open-standard IEC 61131-3 programming languages, enabling data center owners to maintain the control applications themselves instead of paying a premium for a technician to make changes on a proprietary BMS or PMS. In addition, both the controllers and the I/O can time-stamp events to the millisecond, which can save a great deal of time during root cause analysis. As part of an integrated solution, these controllers automatically provide detailed operational views of the equipment they control without requiring any user interface configuration. This reduces engineering effort and improves the operational experience with consistent dashboard designs that can reduce or eliminate the human errors that result in downtime.
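To illustrate why millisecond time stamps help with root cause analysis, here is a minimal sketch that orders events from several devices and isolates those clustered around the earliest one. The Event type, the first_events helper, the device names and the sample data are all invented for illustration and do not come from any particular controller or product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    timestamp: datetime  # assumed to be stamped at the controller or I/O module
    source: str
    message: str


def first_events(events, window_ms):
    """Return events within window_ms of the earliest event, oldest first."""
    ordered = sorted(events, key=lambda e: e.timestamp)
    if not ordered:
        return []
    cutoff = ordered[0].timestamp + timedelta(milliseconds=window_ms)
    return [e for e in ordered if e.timestamp <= cutoff]


# Hypothetical sample data, deliberately listed out of order.
t0 = datetime(2014, 1, 1, 12, 0, 0)
events = [
    Event(t0 + timedelta(milliseconds=412), "CRAH-2", "High supply air temperature"),
    Event(t0 + timedelta(milliseconds=3), "UPS-A", "Input breaker opened"),
    Event(t0 + timedelta(milliseconds=9), "ATS-1", "Transfer to generator failed"),
]

chain = first_events(events, window_ms=50)
for e in chain:
    print(e.timestamp.isoformat(timespec="milliseconds"), e.source, e.message)
```

With only second-level resolution, the first two events in this sample would share a time stamp and their order could not be established, which is the case the finer resolution is meant to resolve.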
System availability is greatly enhanced through redundancy that can prevent unnecessary data center outages. I/O, controllers, networks, and servers can all be installed with a 1:1 primary/backup pairing in which the failure of the primary unit results in an immediate, seamless transition to the backup without loss of equipment control or monitoring. In Tier 3 or Tier 4 data centers, such redundancy supports the need to provide ongoing operations with complete duplication of the system.
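The 1:1 primary/backup idea can be sketched in a few lines. This is a simplified software model under stated assumptions, not how any specific controller implements redundancy: real systems typically handle switchover in hardware or firmware and keep state synchronized between the pair. The Controller and RedundantPair classes and the device names are hypothetical.

```python
class Controller:
    """Hypothetical controller that either applies a setpoint or is failed."""

    def __init__(self, name):
        self.name = name
        self.healthy = True

    def execute(self, setpoint):
        if not self.healthy:
            raise RuntimeError(f"{self.name} is unavailable")
        return f"{self.name} applied setpoint {setpoint}"


class RedundantPair:
    """1:1 pairing: promote the standby unit when the active unit fails."""

    def __init__(self, primary, backup):
        self.active = primary
        self.standby = backup

    def execute(self, setpoint):
        # Swap roles only if the standby can actually take over.
        if not self.active.healthy and self.standby.healthy:
            self.active, self.standby = self.standby, self.active
        return self.active.execute(setpoint)


primary = Controller("PLC-A")
backup = Controller("PLC-B")
pair = RedundantPair(primary, backup)

before = pair.execute(22.0)
primary.healthy = False  # simulate a failure of the primary unit
after = pair.execute(22.0)
print(before)
print(after)
```

The caller keeps issuing the same command to the pair and never addresses an individual unit, which is what makes the transition appear seamless from the operator's side.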
Because data centers have become a critical component of the business enterprise, they can no longer be looked upon as an adjunct to the systems used to manage your office environment. An industrial automation system expands the scope of what can be monitored and controlled, and improves the reliability and uptime of the data center operations.
[1] “2013 Cost of Data Center Outages”, Ponemon Institute, December 2013, page 7.
[2] “2013 Study on Data Center Outages”, Ponemon Institute, September 2013, page 10.
[3] The AWS Team (2014). Summary of the Amazon EC2, Amazon EBS, and Amazon RDS Service Event in the EU West Region. Amazon.com. Retrieved March 18, 2014, from http://aws.amazon.com/message/2329B7/