Disaster recovery (DR) is a critically important area of security planning that involves policies, tools, and procedures designed to enable the recovery or continuation of vital technology infrastructure and systems following a catastrophic event. Organizations prepare for everything from cyber-attacks and equipment failure to natural disasters with DR plans that outline how to quickly resume mission-critical functions without major losses in revenue or business operations. But how do you choose a DR strategy? And how can you test whether it works effectively without the risk of losing data?
Not all systems will respond the same in the event of a disaster
Organizations with on-site or local data centers and a remote backup data center may experience different impairments between the primary and backup networks than organizations that use an off-site data center for geographically diverse redundancy. To truly prepare for disaster, companies must authentically replicate the delay, bandwidth limits, loss, and other impairments found in the diverse networks where data is being replicated, transferred, and stored.
Let’s look at a recent example from a large enterprise that discovered it was taking 26 hours to back up 24 hours’ worth of data. The backups had been working perfectly in the lab environment, but with replication over a network to a data center almost 300 miles away, the technology was overcome by latency. The quick fix was to skip replication on the 12th day to catch up, and simply hope that no data was lost. Testing in advance would have allowed the company to determine in the lab environment how much bandwidth it needed to compensate for latency and other network errors.
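The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and reading the 12th-day skip as the point where the backlog reaches a full day is our interpretation of the figures above, not something the enterprise reported.

```python
def days_until_full_day_behind(hours_to_back_up_one_day: float) -> float:
    """Days until the backlog grows to 24 hours of unreplicated data.

    Assumption: every day produces 24 hours' worth of data and the
    backup job takes the same number of hours each day.
    """
    overrun_per_day = hours_to_back_up_one_day - 24.0  # hours lost per day
    if overrun_per_day <= 0:
        return float("inf")  # the job keeps up, so no backlog builds
    return 24.0 / overrun_per_day


def required_speedup(hours_to_back_up_one_day: float) -> float:
    """Factor by which the job must speed up to finish within 24 hours."""
    return hours_to_back_up_one_day / 24.0


# The figures from the example above: 26 hours per day of data.
print(days_until_full_day_behind(26.0))   # 12.0
print(round(required_speedup(26.0), 3))   # 1.083
```

Under these assumptions the job falls two hours behind each day and is a full day behind after 12 days, which matches the day on which replication was skipped. Sustaining the schedule would take roughly 8% more effective throughput, before allowing any headroom for retries or growth.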
Metrics for Disaster Recovery
A good disaster recovery plan is part of a larger business continuity plan, which should specify two key metrics, recovery point objective (RPO) and recovery time objective (RTO), for each business process (such as running payroll or generating an order). RPO is the maximum period of recent data, measured in time, that the business can tolerate losing permanently. RTO is the amount of “real time” your business can survive with systems down before it incurs significant risks or losses. The metrics specified for the business processes are then mapped to the underlying IT systems and infrastructure that support those processes.
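As a rough illustration of how these two metrics might be checked against measured behavior, here is a minimal Python sketch. The class names, the field names, and all of the numbers are hypothetical. The worst-case data-loss model (one backup interval plus the time the copy spends in flight) is a simplifying assumption, not a standard formula.

```python
from dataclasses import dataclass


@dataclass
class Objectives:
    """Targets set by the business continuity plan for one process."""
    rpo_hours: float  # maximum tolerable window of lost data
    rto_hours: float  # maximum tolerable downtime


@dataclass
class Measured:
    """Values observed when testing the systems behind that process."""
    backup_interval_hours: float  # time between backup runs
    replication_hours: float      # time for a backup to arrive off-site
    restore_hours: float          # time to bring systems back up


def worst_case_data_loss_hours(m: Measured) -> float:
    # Assumed model: a disaster just before a copy lands off-site loses
    # a full interval of new data plus the copy still in flight.
    return m.backup_interval_hours + m.replication_hours


def check(objectives: Objectives, measured: Measured) -> dict:
    loss = worst_case_data_loss_hours(measured)
    return {
        "worst_case_loss_hours": loss,
        "rpo_met": loss <= objectives.rpo_hours,
        "rto_met": measured.restore_hours <= objectives.rto_hours,
    }


# Hypothetical payroll process: lose at most 24 h of data, be down at most 8 h.
payroll = Objectives(rpo_hours=24, rto_hours=8)
observed = Measured(backup_interval_hours=24, replication_hours=26, restore_hours=6)
print(check(payroll, observed))
```

With these made-up inputs the restore time fits the RTO, but slow replication pushes the worst-case data loss well past the RPO. This kind of gap tends to stay hidden until replication is measured over a realistic network.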
Some of the most common strategies for data protection are:
- Backing data up to disk on-site and automatically copying it to off-site disk as well, or backing up directly to off-site disk
- Replicating data to an off-site location, often utilizing storage area network (SAN) technology, in which case only the systems need to be restored or synchronized, not the data
- Using private cloud solutions that replicate data into secure storage domains, where data can be restored to virtual machines in the cloud and accessed from an alternative location in the event of a disaster
- Replicating data to both on-site and off-site data centers using hybrid cloud solutions that can instantly fail over to local on-site hardware but, in the event of a physical disaster, can bring up servers in the cloud data centers as well
- Employing high-availability systems (often associated with cloud storage) that keep both the data and the systems replicated off-site, enabling continuous access to systems and data, even after a disaster
Validate your Disaster Recovery plan with emulated networks
It is crucial to test DR efforts before a true disaster strikes, but testing DR in a perfect lab environment does not provide the same validity as testing under the real-world conditions in which DR would be deployed. Several factors can affect the stability, behavior, and implementation of DR, particularly the reliability of the network, which is especially vulnerable in the event of a natural disaster. However, building a real network just to test your DR strategy is rarely practical and often expensive. A cheaper alternative is to use network emulators to mimic the network conditions that each DR strategy would face. By creating real-world conditions in the lab using network emulators, organizations can evaluate each strategy to see which performs best for their business.
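Before setting up emulated test runs, a back-of-the-envelope model can help decide which network profiles are worth emulating. The Python sketch below estimates single-stream TCP throughput as the smallest of three limits: link bandwidth, window size divided by round-trip time, and the well-known Mathis approximation for loss-limited TCP. The profile names and all numeric values are hypothetical, and the model ignores application behavior, parallel streams, and window scaling. It is a sanity check, not a replacement for testing under emulated conditions.

```python
import math


def estimated_throughput_mbps(bandwidth_mbps: float, rtt_ms: float,
                              loss_rate: float,
                              window_bytes: int = 65535,
                              mss_bytes: int = 1460) -> float:
    """Rough upper bound on single-stream TCP throughput in Mbit/s."""
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    rtt_s = rtt_ms / 1000.0
    # Window limit: at most one window of data in flight per round trip.
    limits = [bandwidth_mbps, window_bytes * 8 / rtt_s / 1e6]
    if loss_rate > 0:
        # Mathis approximation: (MSS / RTT) * (sqrt(3/2) / sqrt(p)).
        mathis_bps = (mss_bytes * 8 / rtt_s) * math.sqrt(1.5 / loss_rate)
        limits.append(mathis_bps / 1e6)
    return min(limits)


def transfer_hours(data_gb: float, throughput_mbps: float) -> float:
    """Hours needed to move data_gb (decimal gigabytes) at a given rate."""
    return data_gb * 8000 / throughput_mbps / 3600


# Hypothetical profiles: (bandwidth in Mbit/s, RTT in ms, packet loss rate).
profiles = {
    "lab LAN": (1000, 0.5, 0.0),
    "regional WAN": (1000, 10, 0.0001),
    "long-haul WAN": (1000, 70, 0.001),
}
daily_data_gb = 450  # assumed daily backup volume

for name, (bw, rtt, loss) in profiles.items():
    rate = estimated_throughput_mbps(bw, rtt, loss)
    print(f"{name}: ~{rate:.1f} Mbit/s, ~{transfer_hours(daily_data_gb, rate):.1f} h per day of data")
```

In this model the same link bandwidth yields very different replication times once delay and loss are introduced, which is why a backup that finishes comfortably in the lab can overrun its window over a wide area network.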
Apposite Technologies is a market leader in network emulation for the pre-production testing of applications over wide area networks. To learn more about performance testing disaster recovery strategies, reach out to us at [email protected]