In my previous post, I mentioned there are six keys to disaster preparedness that you should look for in any data center provider: infrastructure design, documented response plans, mock disaster drills, preventative maintenance, clear communication and the right people. This week, I’ll talk about number two on the list: a well-documented response plan.
It is critical that your data center provider has well-documented emergency preparedness and disaster response plans. The two plans are similar, but each should be specific to the facility’s geographic location and type. These plans identify actions that will prepare the data center operations team for an emergency, including the steps that must be taken before, during and after an event. For example, a typical “inclement weather preparedness plan” will specifically address the risks associated with severe weather, including tornadoes, thunderstorms, hurricanes and floods. Your provider’s plan should include specific tasks the operations team performs at predetermined times leading up to the event, such as arranging contractor and supplier support, adjusting staffing levels and booking hotel rooms if an extended event is expected. These tasks should be repeated at regular intervals, with a final plan in place no later than 12 to 24 hours before the event.
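To make the idea of tasks at predetermined times concrete, here is a minimal sketch of how such a checklist could be tracked. Everything in it is an assumption for illustration: the `PrepTask` type, the `due_tasks` helper, the task wording and the lead times (72, 48 and 24 hours) are hypothetical and do not come from any particular provider’s plan.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class PrepTask:
    description: str
    hours_before: int  # lead time before the expected event at which the task is due


# Hypothetical checklist; tasks and lead times are illustrative assumptions.
CHECKLIST = [
    PrepTask("Confirm contractor and supplier support", 72),
    PrepTask("Adjust staffing levels", 48),
    PrepTask("Book hotel rooms for on-site staff", 48),
    PrepTask("Finalize and distribute the plan", 24),
]


def due_tasks(event_time, now, checklist=CHECKLIST):
    """Return tasks whose due time has been reached, longest lead time first."""
    due = [
        task for task in checklist
        if now >= event_time - timedelta(hours=task.hours_before)
    ]
    return sorted(due, key=lambda task: task.hours_before, reverse=True)


if __name__ == "__main__":
    landfall = datetime(2024, 9, 10, 12, 0)  # assumed expected event time
    for task in due_tasks(landfall, now=landfall - timedelta(hours=50)):
        print(f"T-{task.hours_before}h: {task.description}")
```

Running the same check at regular intervals as the event approaches mirrors the repeated review described above: each pass shows which tasks should already be under way.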
Documentation should be easily accessible to customers upon request, in both soft and hard copy. It should contain critical contact information, including the provider’s management team, along with escalation procedures, so that command and control can be maintained throughout the event.
When it comes to your IP network, it is imperative that your data center services provider knows how to react in a disaster scenario, whether a problem is caused by a hardware or network failure. Do they have escalation procedures? An on-call rotation? Who can assist, and how quickly? Unlike data center facility or system issues, where the cause of a problem is often more obvious, network failures often require deeper inspection and detection before troubleshooting can begin. Does your provider have automated tools that monitor network health and routing stability? If so, are response measures taken automatically or do they require human intervention? Are those response measures documented and is everyone aware of them?
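The difference between automatic response and human intervention can be sketched as a simple decision rule. This is only an illustration under stated assumptions: the `HealthSample` fields, the threshold values and the action names are hypothetical, and a real provider’s tooling and policy will differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HealthSample:
    packet_loss_pct: float      # measured packet loss, in percent
    route_changes_per_min: int  # routing updates seen per minute


# Hypothetical thresholds; real values depend on the network and the provider.
LOSS_WARN_PCT = 1.0
LOSS_CRIT_PCT = 5.0
ROUTE_FLAP_LIMIT = 20


def decide_response(sample):
    """Map a health sample to an action name (all action names are assumptions)."""
    # Severe loss or unstable routing: escalate to the on-call engineer.
    if (sample.packet_loss_pct >= LOSS_CRIT_PCT
            or sample.route_changes_per_min >= ROUTE_FLAP_LIMIT):
        return "page_on_call"
    # Moderate loss: take a documented automatic measure.
    if sample.packet_loss_pct >= LOSS_WARN_PCT:
        return "auto_reroute"
    return "none"


if __name__ == "__main__":
    print(decide_response(HealthSample(packet_loss_pct=2.0, route_changes_per_min=3)))
```

Whatever the actual rules are, the questions to ask your provider stay the same: which conditions trigger an automatic measure, which ones page a person, and where that mapping is written down.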