
Advantages of a Cloud-based Business Continuity Plan

Skyline Technologies  |  
Apr 09, 2019
 
One of the concerns we hear from customers is how much control is given up when moving to the cloud – especially in business continuity. That’s understandable, so we’re going to break down the advantages of an Azure Cloud Cold Site and Hot Site.
 

Azure Cloud Cold Site Advantages

 

Thorough Control Options

With Azure Site Recovery (ASR), you get to choose which servers are replicated, what is involved in each recovery plan, and when that recovery plan is executed. You can choose to automate parts of your recovery plan, or you can choose to keep the overall recovery plan manual and then just automate the steps within that recovery plan.
 

Step-by-Step Processes

Either way, the important point is that the IT team and the business maintain ultimate control when the recovery event occurs. You can create recovery plans within the recovery vault that run as an automated step-by-step process when performing a failover. When creating a recovery plan, you can also set up manual tasks within it. When the recovery plan reaches a manual task, the automation pauses and your admins are alerted that it's time for the manual step. They perform their manual step, log in to the recovery plan, and press continue. Building those steps into the plan ensures the manual tasks get done, so you don't execute the recovery plan and then find you're still not online because of one or more manual steps that everyone forgot about.
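To make the pause-and-continue behavior concrete, here is a minimal sketch in Python. It is only a toy model of the idea described above, not the Azure Site Recovery API: the `Step` and `RecoveryPlan` classes, the step names, and the `confirm_manual_step` method are all hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Step:
    """One step of a (hypothetical) recovery plan."""
    name: str
    action: Optional[Callable[[], None]] = None  # None marks a manual step


@dataclass
class RecoveryPlan:
    """Toy model: runs automated steps in order and pauses at manual ones."""
    steps: List[Step]
    position: int = 0
    log: List[str] = field(default_factory=list)

    def run(self) -> str:
        """Run automated steps until a manual step or the end of the plan."""
        while self.position < len(self.steps):
            step = self.steps[self.position]
            if step.action is None:
                # Automation stops here; in real life this is where admins are alerted.
                self.log.append(f"PAUSED: waiting on manual step '{step.name}'")
                return "paused"
            step.action()
            self.log.append(f"done: {step.name}")
            self.position += 1
        return "completed"

    def confirm_manual_step(self) -> str:
        """An admin confirms the pending manual step, then the plan resumes."""
        pending = self.position < len(self.steps) and self.steps[self.position].action is None
        if not pending:
            raise RuntimeError("no manual step is pending")
        self.log.append(f"confirmed: {self.steps[self.position].name}")
        self.position += 1
        return self.run()


# Hypothetical plan: automated failover, one manual task, more automation.
plan = RecoveryPlan([
    Step("fail over database servers", lambda: None),
    Step("update DNS records by hand"),          # manual step
    Step("fail over web servers", lambda: None),
])

status = plan.run()                  # stops at the manual step
final = plan.confirm_manual_step()   # admin presses "continue"
print(status, final)
print("\n".join(plan.log))
```

The point of the design is that the plan cannot reach "completed" without someone explicitly confirming the manual step, which is what keeps forgotten tasks from slipping through.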
 

Failover Testing

Site recovery within Azure allows us to set up an entirely isolated failover test environment to properly test recovery plans without having any production impact. This one is huge. The more often you test, the better off you are. We recommend testing at least quarterly, ideally monthly.
 

Geo-replicated Storage

You can set up your storage account to be geographically replicated to another region. Microsoft pairs each region with a secondary that is hundreds of miles away from the primary (its published guidance is at least 300 miles where possible). For instance, the Azure North Central US region in Illinois (the Chicago area) is paired with the South Central US region in Texas. That very large geographic span helps ensure your system is protected from local events like flooding or fires.
 

Network Requirements

Network requirements can be minimal, depending on your recovery plans and the change rate of your servers compared to your desired RPO. If you have a transactional system that's sending a lot of data and changing a lot of data on disk, and you want a 15-minute RPO for that server, then you’re going to need substantial upload bandwidth heading out of your primary data center to Azure. Compare that to a web server that may not have a lot of disk changes going on: a 30-minute RPO would be just fine, and the required network bandwidth is a lot lower. The good news is that Microsoft provides tools to assess and analyze that information for your servers ahead of time, so we can understand what those requirements will be and then budget and design accordingly.
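As a rough back-of-the-envelope illustration of that relationship, the sketch below estimates the upload rate needed to ship one RPO window's worth of disk changes within that window. The function name and the change volumes (9,000 MB and 900 MB) are made-up assumptions for the example; it ignores compression, initial replication, and bursts, and it is not a substitute for the assessment tools mentioned above.

```python
def required_upload_mbps(changed_mb_per_window: float, rpo_minutes: float) -> float:
    """Megabits per second needed to upload one RPO window of changes in time.

    changed_mb_per_window: data changed on disk during one RPO window, in megabytes.
    rpo_minutes: the desired recovery point objective, in minutes.
    """
    if rpo_minutes <= 0:
        raise ValueError("rpo_minutes must be positive")
    bits_to_send = changed_mb_per_window * 1_000_000 * 8
    window_seconds = rpo_minutes * 60
    return bits_to_send / window_seconds / 1_000_000


# Hypothetical transactional server: 9,000 MB of changes per 15-minute window.
db_mbps = required_upload_mbps(9000, 15)

# Hypothetical web server: 900 MB of changes per 30-minute window.
web_mbps = required_upload_mbps(900, 30)

print(f"transactional server: about {db_mbps:.0f} Mbps")
print(f"web server: about {web_mbps:.0f} Mbps")
```

With these assumed numbers the transactional server needs roughly twenty times the upload bandwidth of the web server, which is why change rate and RPO have to be considered together.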
 

Azure Cloud Hot Site Advantages

 

Traffic Manager

We can create Traffic Manager profiles that define whether the secondary site is active or passive. If it's active, do we want the split to be 50/50? Do we want the primary site to take 75% of the load and the secondary site to take only 25%? We really do have that control. We get to manage what's active, and when and where those recovery events are going to occur.
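The 75/25 split can be pictured with a small Python simulation of weighted routing. This is only a model of the concept, not Traffic Manager itself (which routes at the DNS level and is configured through Azure rather than application code); the site names, weights, and helper functions are hypothetical.

```python
import random
from collections import Counter
from typing import Dict


def traffic_shares(weights: Dict[str, int]) -> Dict[str, float]:
    """Convert per-site weights into the fraction of traffic each site receives."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one site needs a positive weight")
    return {site: weight / total for site, weight in weights.items()}


def pick_site(weights: Dict[str, int], rng: random.Random) -> str:
    """Pick a site for one request, with probability proportional to its weight."""
    sites = list(weights)
    return rng.choices(sites, weights=[weights[s] for s in sites], k=1)[0]


# Active-active: primary takes 75% of the load, secondary takes 25%.
active_active = {"primary": 75, "secondary": 25}

# Active-passive: the secondary gets no traffic until the weights are changed.
active_passive = {"primary": 1, "secondary": 0}

shares = traffic_shares(active_active)

# Simulate 10,000 requests with a fixed seed so the run is repeatable.
rng = random.Random(0)
counts = Counter(pick_site(active_active, rng) for _ in range(10_000))

print(shares)
print(dict(counts))
```

A failover in this model is nothing more than a change to the weights, for example setting the primary to 0 and the secondary to 1.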
 

Automation

We can automate the recovery plan to include PowerShell scripts and Traffic Manager changes, so the Traffic Manager profile is adjusted when we know a failover event is occurring.
 

Failover Testing

We can still perform our failover testing, and we can choose whether we want that failover test to be production-impacting or not. For instance, if we're running active-passive and we want to do a failover test, do we want the test to actually move client traffic to our secondary site? Or do we want it to be hidden and isolated from our clients, and just hydrate and run the failover and recovery plan to ensure everything comes online? With either a cold site or a hot site, we can do failover testing during the workday.
 

Hybrid Connectivity

We have two main options for network connectivity into Azure data centers: site-to-site VPN and ExpressRoute. Both have their pros and cons and associated costs. The main decision here is going to be driven by our recovery point and recovery time requirements, the amount of change being made to our servers, the amount of replication traffic, and whether we're doing hot or cold replication. The major benefit of hybrid connectivity is its flexibility. We can mix and match technologies (hot, cold, IaaS, PaaS).
 

Scalable

Using the cloud allows us to grow on our business timeline, not our budget timeline. We pay for what we use. By not having to make any upfront hardware purchases, we’re no longer depreciating assets. There's clear value in everything that we have configured in Azure. The other benefit is that we don’t have to expand hardware and wait months or years to see a return on investment. In the cloud and in Azure, you pay for what you deploy: you start replicating a new server and just pay for that new amount of storage. That's it. It's a huge benefit to utilizing the cloud.
 

Multiple Geographies

Azure is currently in 50 geographies. Where do you want to replicate your data? Do you require backups? If you’re replicating data to 30 different geographies and have retention and archiving, do you need to run backups? It’s worth thinking about. We can now really protect our organizations from any type of disaster because we're running across the globe. We’re no longer limited to running within a single state or even a single country.
 

Automation

There can be some serious hardware and software requirements to enable successful automation for key initiatives like business continuity. Running scripts on demand, running what we call runbooks, or doing any other sort of orchestration requires systems that can handle it. The cloud has that infrastructure pre-built and ready to go.
 

Responsibility Shift

In the traditional world, not only are we managing everything that runs as part of the business continuity plan, but we also hold all the contracts and make the purchasing decisions. With the cloud, we don’t have to worry about that part or about keeping our business continuity hardware happy. We no longer have to track mean time between failures for primary and secondary hardware. Now we're shifting the responsibility of making sure the lights are green to Microsoft (or whatever cloud provider we select). We can really start to innovate within that business continuity plan at a much lower cost in both money and time.
 

Time to Implementation – Skyline’s Own Experience

With the cloud, there's no hardware to buy and no facility to lease. Communications systems are already online. We're ready to go so our time to implementation is much, much less. At Skyline, we decided to migrate to Azure by shifting our on-premises primary location to it. When you do that, you utilize Azure’s cold site recovery solution.
 

ASR Configuration – 4 hours

Just like most companies our size, we had a decent internet service pipe going out to Azure, so that part was done. When it came time to move to the cloud, we set up ASR in about four hours. We had the site recovery set up, integrated with our system, and connected to our in-house VMware. We had everything talking with each other within four hours, and the replication had begun.
 

First Failover Test – Next Day

Our first replication test was 5 servers. We were able to do our first failover test the very next day. Overnight, the servers that we wanted to test had fully replicated, and the RPO looked green (we selected a 15-minute RPO). We were ready for a failover test. We ran multiple failover tests to ensure that we knew all the manual and automated steps needed to bring that system online with as little user and production impact as possible.
 

Servers Migrated – 15+

In total, we seamlessly migrated over 15 servers. We had over 100 servers on-premises, but we were able to decommission many of them because of this process.
 

Production Impact – 0 (literally)

We know we talk a lot about Azure, but we really had zero impact. In fact, we have employees who didn't even realize that we had moved our infrastructure to Azure.
 

Get the free eBook

This is the fourth of 5 blogs on Where Should the Cloud Fit Into Your Business Continuity Plan? To read them all right now, download our free eBook.
 

 
