Our prior installment in this series discussed building a roadmap to using cloud infrastructure services starting with data storage expansion as Phase 1. In today’s installment we get into Phase 2 of our proposed roadmap: using the cloud for disaster recovery (DR) and data protection.
For many mid-sized businesses struggling to maintain existing IT infrastructure, implementing a strong disaster recovery and data protection plan is either financially out of reach or too resource-intensive to maintain. After all, DR planning traditionally involves purchasing redundant infrastructure that is housed off-site (e.g., a remote office, co-location facility, or hosting provider) and remains underutilized until a disaster occurs, offering little immediate ROI.
However, you don’t need to look very far to see that lacking a disaster plan can be perilous. Consider the amount of business lost during Hurricane Sandy, which hit the northeastern US in late 2012. Only a month after the hurricane, financial analysis firm IHS Global Insight estimated the lost business activity from Sandy at $25B, a staggering sum. While natural disasters are relatively infrequent, they can shut down unprepared businesses for days or weeks and result in substantial revenue losses or even business closure.
In spite of the risks, many organizations simply roll the dice, hoping that they won’t be afflicted by a disaster. They do this not out of a predilection for gambling, but rather because their resource or budgetary constraints make true disaster recovery simply unattainable. For highly regulated industries, however, rolling the dice isn’t even an option. Regulations such as HIPAA in the healthcare industry require businesses to have a disaster recovery plan in order to be compliant.
Using Cloud for Data Recovery
Data protection is often the cornerstone for DR and many businesses have been moving to the cloud in the form of online backup. Unlike tape, data that is backed up to the cloud stays online and is available for immediate recall, meaning a restore process can be started instantly and discrete bits of data can be recalled immediately. Online backup effectively provides shorter recovery times than recovering data from tapes offsite, as illustrated in the figure below.
However, for many organizations, data recovery is simply one element of maintaining business continuity. Applications must also be restarted once data is recovered in order to get a business operational. If application servers are lost or damaged during an outage, it may take days to reconstruct an application environment. For this reason, many organizations opt for faster recovery times by using cold standby or hot standby disaster recovery sites. Both options require dedicated infrastructure for hosting applications in the event of a disaster. Despite the substantially higher costs of these sites, as illustrated in the chart above, alternatives for rapid application recovery did not emerge until recently.
Using Cloud for Application Recovery
Since one of the tenets of cloud is on-demand provisioning of infrastructure, it naturally represents a more efficient way to activate redundant infrastructure for disaster recovery — on-demand and only when needed in the case of a disaster or a disaster test. A pay-as-you-go model substantially reduces costs over dedicated DR infrastructure and eliminates the inherent underutilization. Cloud also alleviates the need for an off-site DR location.
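To make the pay-as-you-go argument concrete, here is a rough back-of-the-envelope sketch. Every number in it (server count, yearly cost per standby server, hourly rate, hours of DR activity, storage cost) is a made-up placeholder rather than real vendor pricing, and the function names are purely illustrative. The point is only the shape of the comparison: dedicated standby infrastructure is paid for all year, while on-demand compute is paid for only during tests or an actual failover, plus the year-round cost of storing the protected data.

```python
def dedicated_dr_annual_cost(servers, cost_per_server_per_year):
    """Standby infrastructure is paid for all year, whether or not it is used."""
    return servers * cost_per_server_per_year


def on_demand_dr_annual_cost(servers, hourly_rate, active_hours_per_year,
                             storage_cost_per_year):
    """Compute is billed only while running (DR tests or a real failover);
    storage for the protected data is billed all year."""
    compute = servers * hourly_rate * active_hours_per_year
    return compute + storage_cost_per_year


# Hypothetical placeholder figures, NOT vendor pricing.
dedicated = dedicated_dr_annual_cost(servers=20, cost_per_server_per_year=3000)
on_demand = on_demand_dr_annual_cost(
    servers=20,
    hourly_rate=0.5,            # assumed cost per server-hour
    active_hours_per_year=96,   # assumed: a few test days per year
    storage_cost_per_year=6000, # assumed cost of keeping backups online
)

print(f"Dedicated DR site: ${dedicated:,.0f} per year")
print(f"On-demand cloud DR: ${on_demand:,.0f} per year")
```

Plugging in your own figures is the useful part; the gap narrows quickly if the cloud environment has to run for long periods or if data storage and transfer costs are high.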
When it comes to application recovery using the cloud there are several approaches, each of which requires careful consideration. Three of these include:
- Recovering applications on virtual servers through a cloud provider’s catalog: Although this recovery process may be viable for small workloads, it can be a time-consuming manual process. This is particularly true when attempting to recover tens or hundreds of servers.
- Recovering virtual machines directly in cloud compute: A faster approach than the above involves recovering virtual machines in the cloud, similar to failover of virtual machines between hypervisors. This is possible if the same hypervisor runs on-premise and in the cloud. However, while moving virtual machine (VM) images between like hypervisors is generally straightforward, many cloud providers may not offer sufficient administrative privilege in their virtual compute environments or simply may not be compatible with on-premise hypervisors.
- Importing on-premise virtual machines into the cloud via conversion scripts and tools: The promise of this approach is that it addresses hypervisor incompatibility between on-premise and cloud environments. However, it is important to ensure that conversion scripts and tools operate correctly across all virtual machines, since an import failure during a disaster can be a show-stopper. Also, be sure to confirm that the scripts/tools can operate bidirectionally, meaning they provide a way to eventually fail virtual machines back to the on-premise environment.
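For the third approach, one way to act on the advice about verifying conversion tools is to record the outcome of each DR test per virtual machine and flag anything that failed to import or could not be failed back. The sketch below assumes you already have those test results from whatever conversion tooling you use; the `ConversionResult` record, the `dr_readiness_report` function, and the VM names are all hypothetical and not part of any particular product.

```python
from dataclasses import dataclass


@dataclass
class ConversionResult:
    vm_name: str
    import_ok: bool    # VM converted and booted in the cloud during the DR test
    failback_ok: bool  # VM converted back to the on-premise format afterwards


def dr_readiness_report(results):
    """Summarize a DR test: which VMs failed import, which cannot fail back."""
    import_failures = [r.vm_name for r in results if not r.import_ok]
    # Failback only matters for VMs that made it to the cloud in the first place.
    failback_failures = [
        r.vm_name for r in results if r.import_ok and not r.failback_ok
    ]
    return {
        "ready": not import_failures and not failback_failures,
        "import_failures": import_failures,
        "failback_failures": failback_failures,
    }


# Hypothetical results from a single DR test run.
test_results = [
    ConversionResult("erp-db", import_ok=True, failback_ok=True),
    ConversionResult("file-server", import_ok=False, failback_ok=False),
    ConversionResult("web-frontend", import_ok=True, failback_ok=False),
]

report = dr_readiness_report(test_results)
print(report)
```

Running a check like this after every scheduled DR test, rather than discovering conversion problems during a real outage, is the practical takeaway.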
Like any major IT project, DR in the cloud requires a certain degree of planning and also regular testing — but the payoff can be substantial in terms of reducing disaster recovery costs, improving resilience and achieving compliance with regulatory requirements.
Next week, we’ll look at a fourth option that addresses many of the failings of the options outlined above while still delivering on all of the benefits.