Ongoing Operations has worked with hundreds of credit unions and seen numerous disaster recovery strategies brought to their knees by poor connectivity choices. In many ways, we would argue the most important technology decision you will make in your IT Disaster Recovery plan involves connectivity. Almost all credit unions require the following items to be accounted for:
· Branch Connectivity
· Internet Connectivity
· Voice Connectivity
· Third-Party Connections (ATM, Federal Reserve, Online Banking, Credit Card, etc.)
The reality, though, is that unless you have specifically designed your entire network to handle the full failure of some components, it is extremely difficult to recover efficiently in a disaster. When done right, however, recovery can take almost no time at all and be an easy win in the recovery effort. The first question you should ask yourself is: how quickly do I need to be able to access my key systems today, and how quickly will I need to access them tomorrow? If the answer is anything less than 12 to 24 hours, you should probably consider spending some engineering time and resources to architect your network correctly to handle failover scenarios.
Who is your ISP or who should it be?
The first and most important issue to look at here is your ISP. The ISP for most credit unions is their telecom carrier. In our opinion, this is a mistake. The telecom carrier in most cases is really only responsible for your local connectivity. They have almost no visibility into your business applications, and, as anyone who has tried to get hold of a major carrier on a weekend knows, they aren’t so great in emergency situations. At the same time, very few credit unions have the tools or resources to handle both the engineering challenges and the provisioning and management of their own DNS or public IP space. In some ways you can think of it like the old days before you could take your cell phone number with you. If you didn’t like Verizon, well, you had better get a new phone number. However, taking the time to partner with someone like OGO, or to provision your own public IP/DNS space, puts you in the driver’s seat of your network. Once there, rerouting critical traffic becomes much easier and quicker.
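To make the "driver's seat" point concrete, here is a minimal sketch of what controlling your own DNS can look like in a failover. It assumes each public-facing service has one address at the primary site and one at the recovery site; the hostnames, the addresses (taken from the reserved documentation ranges), the `ServiceEndpoint` type, and the 300-second TTL are all hypothetical and not taken from any real credit union or from OGO's tooling.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ServiceEndpoint:
    """One public-facing service with an address at each site (hypothetical)."""
    hostname: str
    primary_ip: str
    dr_ip: str


# Hypothetical inventory; the IPs come from the reserved documentation ranges.
SERVICES = [
    ServiceEndpoint("online.example.org", "192.0.2.10", "198.51.100.10"),
    ServiceEndpoint("vpn.example.org", "192.0.2.20", "198.51.100.20"),
]


def build_a_records(services: List[ServiceEndpoint], active_site: str,
                    ttl_seconds: int = 300) -> List[str]:
    """Return zone-file style A records that point at the active site."""
    if active_site not in ("primary", "dr"):
        raise ValueError("active_site must be 'primary' or 'dr'")
    records = []
    for svc in services:
        ip = svc.primary_ip if active_site == "primary" else svc.dr_ip
        records.append(f"{svc.hostname}. {ttl_seconds} IN A {ip}")
    return records


if __name__ == "__main__":
    for line in build_a_records(SERVICES, "dr"):
        print(line)
```

If you own the records, moving members to the recovery site is a matter of publishing the "dr" set rather than waiting on a carrier. How the records actually get published depends on your DNS provider and is outside this sketch.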
How do you want your branches to work in a disaster?
Aside from the ISP portion of the problem, branch communication is key as well. In most cases, branch networks are designed to route traffic from the branches to the main office and then to the internet. Even with the Disaster Recovery or Hot Site on the credit union’s main network, traffic won’t just reroute without some work. The main location often has a primary router with routing tables that tell servers, workstations, and other components where to go to get data. If alternative paths are not set up in advance, this detour routing must be done at the time of the disaster. Often, nobody can find a network diagram or the other documentation needed to make the process smoother. Instead, if you set up the primary and backup routing on day one and tie it into the same systems used to manage the ISP and public IP space, you can reroute the traffic almost instantly. This challenge is getting bigger each day as more and more credit unions use multiple VLANs to segregate traffic for data, internet, voice, and other components in an attempt to make their networks more secure.
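The idea of defining primary and backup routes on day one can be illustrated with a small sketch. It models a routing table in which each destination has a preferred path through the main office and a higher-metric backup path through the recovery site, and it picks the best route whose next hop is still reachable. The prefixes, router names, and metrics are hypothetical, and real routers do this in their own configuration and routing protocols rather than in Python.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional, Set


@dataclass(frozen=True)
class Route:
    destination: str  # CIDR prefix this route covers
    next_hop: str     # label of the router that handles it (hypothetical)
    metric: int       # lower is preferred among equally specific routes


# Hypothetical table: every prefix has a primary and a pre-built backup path.
ROUTES = [
    Route("10.20.0.0/16", "main-office-router", 10),
    Route("10.20.0.0/16", "dr-site-router", 100),
    Route("0.0.0.0/0", "main-office-router", 10),
    Route("0.0.0.0/0", "dr-site-router", 100),
]


def select_next_hop(dest_ip: str, routes: List[Route],
                    healthy: Set[str]) -> Optional[str]:
    """Pick the most specific, lowest-metric route via a healthy next hop."""
    addr = ip_address(dest_ip)
    candidates = [
        r for r in routes
        if r.next_hop in healthy and addr in ip_network(r.destination)
    ]
    if not candidates:
        return None  # no usable path: this is the gap to close before a disaster
    best = min(
        candidates,
        key=lambda r: (-ip_network(r.destination).prefixlen, r.metric),
    )
    return best.next_hop


if __name__ == "__main__":
    both_up = {"main-office-router", "dr-site-router"}
    print(select_next_hop("10.20.5.9", ROUTES, both_up))
    print(select_next_hop("10.20.5.9", ROUTES, {"dr-site-router"}))
```

Because the backup entries already exist, losing the main office only changes which next hops are healthy; nobody has to reconstruct the routing from a missing network diagram during the outage.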
How do you want your network to work in an attack?
With DDoS attacks and a growing number of IT security threats, designing your network to give you options is key. Just as being on a two-lane road with no alternate route isn’t great when there is an accident, the same is true of your network architecture. The strategy is similar to the disaster architecture in that you want to decouple your ISP and branch routing from your telecom provider and build the appropriate backup plans today. That way, when an attack hits, you will be well prepared to reroute and maintain services quickly. This becomes even more critical in hacktivism attempts, as the attacker is smart and will adapt to your mitigation strategies. Without the right tools and architecture, you can be down for hours or days just trying to figure out what is happening.
How quickly do you want to get your third parties back up?
In my opinion, the most important things to get working in a disaster are the services that are self-service for the member. Think of credit cards, debit cards, ATMs, online banking, etc. The quicker you can recover those services and make them work as normal, the quicker members can return to normal and the less disruption they perceive. The downside of preparing to recover these third parties is that you often need to buy two of every connection, and they are often expensive connections. The same branch routing and internet IP flexibility discussed above applies to all the needed third-party connections as well. Taking the time to design the right solution is key for quick recovery. OGO has a product called Connect 4 that aggregates these connections for many credit unions and many providers and greatly lowers the price. In a disaster, OGO will reroute the traffic and bring up the connections so that you can focus on member service.
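One simple way to start on the "two of every connection" problem is to inventory your third-party links and flag the ones that have no backup path. The sketch below does only that. The link names echo the examples used in this article, while the circuit labels and the `ThirdPartyLink` type are hypothetical placeholders, not a description of Connect 4 or of any real provider's circuits.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ThirdPartyLink:
    name: str
    primary_circuit: str
    backup_circuit: Optional[str] = None  # None means no second path exists


# Hypothetical inventory; circuit labels are placeholders.
LINKS = [
    ThirdPartyLink("ATM network", "circuit-a", "circuit-b"),
    ThirdPartyLink("Federal Reserve", "circuit-c"),
    ThirdPartyLink("Online banking", "circuit-d", "circuit-e"),
    ThirdPartyLink("Credit card processor", "circuit-f"),
]


def links_without_backup(links: List[ThirdPartyLink]) -> List[str]:
    """Return the names of third-party links that have no backup circuit."""
    return [link.name for link in links if not link.backup_circuit]


if __name__ == "__main__":
    for name in links_without_backup(LINKS):
        print(f"No backup path: {name}")
```

A list like this makes it easier to decide which connections to duplicate first, or which to hand to an aggregating provider.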
If you take away one thing from this article, I would make it the need to decouple your ISP and branch routing from your telecom provider and integrate it with either your own tools and resources or those of your disaster recovery provider. That way, when a disruption strikes, you can reroute things quickly and efficiently without waiting on a disinterested service provider.