This webinar will show you everything you need to know about the NCache Bridge feature for WAN replication of cache across data centers.
Here is what this webinar covers:
Today's webinar topic is going to be WAN Replication for a Multi-Datacenter Deployment using NCache. In today's webinar, we're going to cover NCache's Bridge feature, which includes NCache's bridge topologies, the advanced bridge features NCache has, queuing, multi-site sessions, as well as bridge performance and debugging and monitoring options.
Today we have lined up a very important topic, specifically for applications which are deployed in multiple datacenters. There could be various reasons for this: for example, you need a DR site, you need an active-active multi-datacenter deployment, or it could be East-to-West migration of data that you need.
So, we have a WAN replication feature available with the help of our bridge topology, and I'll cover all the details: how to use object caching while you have WAN replication turned on, and how to use it for active-passive, active-active, and multiple active-active datacenter deployments. So, we have a lot to cover. I believe everybody can see my screen and hear me fine. If I can get quick confirmations through the GoToMeeting questions and answers tab, that would be really good, and then we'll quickly get started with the presentation. So, please confirm that everybody can see our screen and hear us fine without any issues.
So, I'll get started with some very basic information about why you need a distributed caching system like NCache. Typically, it is an application performance and scalability bottleneck that limits your applications from performing in a faster and more reliable manner.
Your application tier is very scalable. Whether you have a web application or a backend application, you can always create a web farm or application server farm, where your application is deployed on multiple servers and your load is distributed. Multiple servers help serve all those application requests in parallel, in combination with one another. But all of these applications need to talk to a backend database, and that is usually the source of contention. The database becomes a performance as well as a scalability bottleneck for your application, because you cannot scale out database servers, and a database is a very expensive resource as well. You can always scale up, but there's a limit to how much you can scale up a database server. NoSQL is usually not the answer either, because you need to re-architect your application: you need to stop using your relational database and start using a NoSQL data source. We also have a product called NosDB, which is a NoSQL database, but we're proposing a different way to handle this, and that is through a distributed caching system.
So, first of all, the solution to this scalability problem is very simple: start using an in-memory distributed caching system. It's super fast because it's in memory, in comparison to disk, so your application's performance is going to improve right away as soon as you plug in NCache.
Secondly, it's a team of servers. It's a cache cluster, not just a single source like a database. You have multiple servers joined in a cache cluster, so it's a logical storage pooled from as many servers as you choose to add. That makes it very scalable in comparison to your relational databases. You can start off with 2 servers and add more servers at runtime, so it becomes more and more scalable, and as a matter of fact linearly scalable: as you add more servers, you keep increasing the request handling capacity of NCache. A nice thing about NCache is that you use it in addition to a backend relational database. There are a lot of features which complement your use of data that comes from a backend database, so you can always use NCache in conjunction with your relational database; it's not a replacement for your relational data sources. Now, some scalability numbers.
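Since the cache sits alongside the relational database rather than replacing it, the usual access pattern is cache-aside: read from the cache first and only hit the database on a miss. Here is a minimal, generic Python sketch of that idea; a plain in-process dictionary stands in for the cache cluster and a function stands in for the SQL call, so this is illustrative only and not NCache's API:

```python
import time

class CacheAside:
    """Minimal cache-aside sketch: check the cache first, fall back to
    the database on a miss, then populate the cache for next time."""

    def __init__(self, db_lookup, ttl_seconds=300):
        self._db_lookup = db_lookup          # slow backing store (e.g. a SQL query)
        self._ttl = ttl_seconds
        self._store = {}                     # key -> (value, expiry_time)

    def get(self, key):
        entry = self._store.get(key)
        if entry is not None:
            value, expires_at = entry
            if time.time() < expires_at:     # cache hit, still fresh
                return value
            del self._store[key]             # expired: evict and fall through
        value = self._db_lookup(key)         # cache miss: go to the database
        self._store[key] = (value, time.time() + self._ttl)
        return value

# Usage: the "database" here is just a dictionary behind a function,
# standing in for a real relational query.
db = {"customer:1": {"name": "Alice"}}
cache = CacheAside(lambda k: db[k])
first = cache.get("customer:1")    # goes to the "database"
second = cache.get("customer:1")   # served from memory
```

The point of the sketch is the ordering: the database stays the system of record, and the cache only absorbs repeated reads.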
NCache is very scalable: as you add more servers, NCache allows you to handle more and more requests out of the cache cluster. We recently conducted these tests in our QA environment, using our AWS lab, where we kept increasing the load and also kept adding more servers. With up to 5 NCache servers, which is a very regular configuration for a distributed cache, we were able to achieve 2 million requests per second, and that was an upward trend: whenever we added more servers, we added more capacity to the cache cluster. With an average 1-kilobyte object size, this is the kind of performance you can also expect out of NCache, and with better hardware you can stretch these numbers even further and get better throughput out of NCache. By the way, for these benchmarks there's a whitepaper and a video demonstration published on our website as well, so you can take a look at those too.
Some deployment details: what would a typical deployment of NCache look like?
Here is a single-site deployment of NCache. As you can see, we have a single site, and in our case, since we're talking about the WAN replication aspect, we would obviously have more than one deployment: a separate datacenter where NCache and the applications are also deployed.
So, with our distributed cache deployment as shown in the diagram, let's talk about how a typical deployment looks. We have a team of servers again, 4 to 5 servers shown in the diagram, where your cache cluster is hosted, and as you can see it sits in between your application and database. The idea here is that you use these sources in combination with one another for object caching, but for session caching, the cache becomes your main source of data: all your sessions can be stored in NCache and you don't have to go anywhere else. A very flexible deployment model is available. NCache can be hosted on premises, on physical or virtual boxes. It can also be in the cloud, public or private, including Azure and AWS, because we have marketplace images available for both of these cloud vendors. In general, it can be any server running Windows or Linux; the only prerequisite for NCache is the .NET or .NET Core framework. If that is available, NCache can be deployed on Windows as well as Linux environments, and, like I said, in any environment: you can use Docker, and you can also host NCache in a Kubernetes cluster, which opens up a lot of other platforms. You can use it in OpenShift, in Azure Kubernetes Service, and in Elastic Kubernetes Service as well. NCache is equipped to be deployed on all those platforms.
There are two deployment options. One is that you go with a dedicated cache tier, as shown in the diagram, where your applications run on separate boxes and NCache runs on a separate dedicated tier. We also have a shared tier approach available, where you can run NCache alongside your application boxes: wherever your applications are hosted, NCache can be hosted on top of it. So, I believe this is pretty straightforward. In a multi-datacenter deployment you would have more than one datacenter and the same NCache deployment on the other datacenter as well, which we'll cover in upcoming slides. By the way, if there are any questions, you can always post them in our questions and answers tab; Zack and I will keep an eye out for all the questions which are posted, and we'll be very happy to answer them for you. Speaking of questions, since you mentioned it just now, I have one that I'd like to bring up. It's very simple; you were mentioning Kubernetes. The question was: we're going to talk about bridges and this in general; are there any operating system requirements for all of this? Are you able to run this on Linux? Absolutely. NCache is very flexible. As you can see, even on our deployment diagram, NCache is supported on Windows and Linux servers. On Linux servers, you just need the .NET Core release of NCache, and we have a server as well as a client for that. So, if you want to run NCache servers on Linux using .NET Core, that's possible, and then your applications can always use our .NET Core release and be deployed on Windows as well as Linux. So, yes. Awesome. I'll let you actually go through the rest of it and I'll bring up the questions later.
So, next we'll talk about a multi-datacenter deployment of NCache. Your application may be deployed in multiple datacenters, or it could be that you have one active site and then a passive site for DR scenarios: for example, the active site goes down and your application requires that you always be up and running. If it's a mission-critical application that's important to your business, downtime at the site level is something that would impact your business.
The NCache cluster is designed in such a way that it's already equipped with high availability and data reliability features. At the single-site level, if one or two servers go down, for example if you lose a server, NCache is equipped to handle that outage without any issues. But today we're talking about what happens if we get a site-level outage, or if we need to bring a site down for maintenance: the entire site being down, all the servers down. NCache is equipped to handle that scenario too, and that's what we're planning to cover today. So, let's talk about why you need WAN replication.
Typically, when your applications need high availability, a single site can become a single point of failure. If your site goes down, you lose all the data, you can potentially get downtime for your application users, and that could impact your business; we've already established that. Multi-region apps are also slow if they have to talk to one another across the WAN. For example, you have your application deployed in one datacenter in the US region, and then another application deployed in Europe or, say, an Asian region. In that case, if your application's databases are located in one of the datacenters, the remote site has to go across the network, so your network speed would add latency for that other site. To handle that scenario, you usually replicate your data sources across the WAN as well, and that's what we're recommending for NCache too: it should be replicated. With a common data source, the remote site has to go across the WAN, and that could give you a performance impact because data is not local to that site; the distance between datacenters would also impact your throughput. There is only so much data that you can transmit between sites, so that can limit your request handling capacity.
So, these are two issues if you have multi-regional apps and both apps are active. Data replication at the request level is expensive as well. For example, suppose you don't replicate the entire database and you have the data source sitting in one datacenter. Now, every request from the geographically remote location has to go to your database, and request-level replication, for every data request unit that comes to the data source, is going to be extremely expensive; it would eat up a lot of bandwidth and resources. So, you need an active mechanism where you have data locally available, and that's why WAN replication of the cache is needed as a must: so that your data from one datacenter is replicated across the network to the other site.
Some use cases: where exactly can you use WAN replication?
The most common one that we come across is a disaster recovery site. You have an active site serving your main business use case; that's where your traffic is being generated and handled. What if the entire site goes down? You need a fallback option, right? That DR site should already have the data available; otherwise, it could not handle those data requirements if it had to go back to a site which is already down. So, you need data to be made available on the DR site so that it's already up and running. You just need to shift your traffic to that DR site; you should not have to do anything else, just route your traffic to the disaster recovery site, and it should work with the same performance, the same performance metrics, that you had with the active site. So, 100% data recovery in case of failure is possible with the help of NCache WAN replication.
Multi-regional applications can share data as well as load. With active-active sites, you might have one region in the US and one in another part of the world, for example Europe or Asia. If you want requests to a datacenter to be handled based on location affinity, you can achieve that: a user from Asia connects to a site within that region, nearest to them, and uses the cache there, and that cache is in sync with the other cache in the US region. Users can bounce between sites as well. For example, if you need to manage overflow or distribute capacity, some of the users may now need to bounce to the US region because the Asia region is at full capacity, and you can always do that. So, at the site level, you can load balance your requests based on the capacity each site is handling at that point in time. Since cache data is already replicated across datacenters, and we'll talk about how to achieve that, your multi-regional applications are efficiently able to share their application data and also share the request load, with equal load sharing as well. No redundant data migration is done; it's just a matter of a request bouncing from one datacenter to the other, and you can always get that data from the cache it is connected to.
East-to-West application data migration is another use case. For example, Asian markets open earlier than Western markets, so your data trend usually flows from East to West. Your Eastern site can have a cache set up, and as the time zones progress, data moves between datacenters toward the Western region until it reaches the West. If you have the cache data replicated across datacenters, the Western region is able to take advantage of all the data made available from the Eastern region. So, you can have East-to-West data migration, which is the third use case.
The fourth one is deploying, upgrading, and maintaining without any downtime. That's becoming a very pressing use case with NCache as well. If you're planning an upgrade, you can have upgrades between older and newer versions using our bridge topology, where data from the older version can be transmitted to the newer version with the live upgrades feature. It could also be between sites: for example, you can use one site and replicate data actively to the passive site, then upgrade, deploy new code, or perform maintenance on the active site while all the data is available and your traffic is routed to the passive site. So, both sites can always be up and running, with zero downtime and no application data loss.
So, let's talk about how to handle all of that. The name of the feature is NCache Bridge. It's part of the same product; you don't need any separate installation for it. NCache Enterprise is equipped with the NCache bridge topology, so let's talk about it.
So, the NCache Bridge feature allows you to replicate the cache across datacenters.
It's based on an async replication model, so it does not incur any performance degradation on the application side. Your client applications stay actively connected to the cache in their own datacenter. For example, you have clients here, and then you can create a bridge, which is itself backed by an active-passive queue, and it transmits data to the other sites asynchronously.
So, because it's based on async replication, there is no performance degradation from replicating data. It's very reliable and fault tolerant: it automatically detects connection failures, it automatically reconnects, and there are automatic retry options available. The bridge itself is also backed by an active-passive queue.
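To make the async model concrete, here is a toy Python sketch of the idea, not NCache's implementation: a write on the source site is acknowledged immediately, while a background worker drains a queue to the remote site and retries on connection failures. The `send_to_remote` callable, retry count, and delay are illustrative assumptions:

```python
import queue
import threading
import time

class BridgeQueue:
    """Toy sketch of an async replication bridge: enqueue() returns
    immediately and a background worker pushes operations to the remote
    site, retrying on transient WAN failures."""

    def __init__(self, send_to_remote, max_retries=3, retry_delay=0.05):
        self._q = queue.Queue()
        self._send = send_to_remote          # may raise ConnectionError on WAN failure
        self._max_retries = max_retries
        self._retry_delay = retry_delay
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def enqueue(self, op):
        self._q.put(op)                      # returns immediately: async replication

    def _drain(self):
        while True:
            op = self._q.get()
            for _ in range(self._max_retries):
                try:
                    self._send(op)           # replicate across the WAN
                    break
                except ConnectionError:
                    time.sleep(self._retry_delay)   # automatic retry after a delay
            self._q.task_done()

    def flush(self):
        self._q.join()                       # wait until the queue has drained
```

Because the application only touches `enqueue`, a slow or flaky WAN link never blocks the local write path; that is the core reason async replication avoids application-side performance degradation.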
So, there is an active bridge server, and then there's a passive bridge server as well. If the active bridge server goes down, the passive one picks up and starts all the replication operations without any delay. It's very easy to set up: you don't need any code changes or any extra installations; it's part of the same Enterprise product. It comes with its own monitoring and management support, which is integrated into the same NCache Enterprise product, and it supports multiple topologies, which I'm going to cover next.
So, we have three major topologies.
We have active-passive, where we have an active site and then a passive site. The passive site is also taking client requests, but the data flow is from active to passive. So, if you have DR site requirements, you can make one site active, connected to the bridge, and have the other site passive. The active site transmits data to the passive site; it's a one-way transmission. The term passive essentially means that the passive site is not transmitting data back to the active one. It is still running, and your client applications can take advantage of it; it's not stopped by any means. East-to-West migration can be achieved with active-passive, and your maintenance and upgrade use cases can be handled with the help of active-passive as well.
The active-active topology is when you have one application deployed in two different geographic locations and you want data from site 1 to be made available on site 2 and site 2's data made available on site 1. If your application needs data sharing between geographical sites, you can target active-active, where you have users active on both datacenters: clients are connected to both datacenters and there is two-way replication going on between the two sites. Then we have the 3+ (or 2+) active-active topology, where we have one primary bridge server, but it's transmitting data to all sites and those sites are also transmitting data back to every other site. So, one update has to be applied on all of the datacenters, and vice versa.
So, here is our active-passive.
In this, we have the bridge, which is a queue, and which is itself active-passive. We have the cache cluster on site 1, with 3 servers here, handling client requests and connected to the bridge. The bridge also resides on one of the sites. In some cases, you can have the active bridge server on site 1 and the passive bridge server on site 2; that's also possible, but we typically recommend that you put the bridge on one of the sites in your deployment architecture. The second site is the passive site, and again, by passive, it's still running; it's just that the passive site does not replicate data back to the active one. It's one-way transmission of data, and that's all it means when we say this is a passive site. You can run client applications here, and it's fully functional even in this state. So, it's active-passive replication of data, and if the active bridge server goes down, the passive one becomes activated automatically. No code changes are needed. I'll show you how to configure the bridge once we progress to our hands-on portion; it's pretty simple.
A question came in, and it has to do with this active-passive setup: primarily, if you have an active and a passive site, how do you make the passive site activated? Is it a manual process? Is the site stopped? How do you do that? Okay, so, if I've understood this question correctly, in terms of how we activate the passive site: it's already activated. It's running, and if we bring the active site down, or we want to move traffic here, it's your application traffic load that you need to move to this site. You have application servers on both sites; whatever data you have is transmitted here, and this site's users have the data available from the cache itself. So, you can always route your traffic to the passive site and get all the data. There are no steps needed in order to make it activated. However, if you want this site to start transmitting data back to the active site as well, you can make it active using our GUI tools. So, in terms of replication, if you want this site to replicate data back to the active one, you can always make it active, and this is a runtime process: with one click in the GUI tool you can achieve that, or you can use our PowerShell tools to make that happen. But if your question is in regards to making the passive node active, whether there is a manual step in order to have client applications connect to it and be able to use data: it's already running. Your applications start using it as soon as you start routing traffic to this cache cluster. So, within your load balancer, you switch the failed site off and route all your traffic to the available site, which is up and running already, and you can take advantage of all the data which has been replicated.
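The load-balancer step described above amounts to a simple priority failover: prefer the active site, fall back to the passive one when the active site fails its health check. A minimal, hypothetical Python sketch of that routing decision (the site names and the health check are made up for illustration; in practice this logic lives in your load balancer, not in NCache):

```python
def route_request(sites, is_healthy):
    """Pick the first healthy site in priority order: the active site
    first, the passive site as the fallback. Raises if nothing is up."""
    for site in sites:
        if is_healthy(site):
            return site
    raise RuntimeError("no site available")

# Usage: simulate a site-level outage taking the active datacenter down.
sites = ["active-dc", "passive-dc"]    # priority order: active first
down = set()
normal = route_request(sites, lambda s: s not in down)
down.add("active-dc")                  # active site suffers an outage
failover = route_request(sites, lambda s: s not in down)
```

Because the bridge has already replicated the cache data, the fallback site can serve requests immediately; the only switch that happens at failover time is this routing decision.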
So, active-active is again based on the same principle, where we have the bridge running on one of the sites.
We have cache 1 and cache 2. Both sites are active, and even a passive topology can be turned into active by right-clicking and making it active. In this case, data from the site 1 cache is transmitted to site 2 asynchronously, from cache to bridge and from bridge to cache, and similarly site 2 is also transferring data back to site 1.
Then there are 3+ active-active datacenters, where we have three or more active sites and one of the sites hosts the bridge server. We can also have a fallback site for the bridge; we can have a backup bridge site as well. But, in general, we would have one of the sites hosting the bridge, and that site is transmitting data to the other sites; similarly, site 2 is transferring data to site 1 and to site 3 through the bridge. For active-active, we have conflict resolution which is time-based, so the last update wins. All the data structures that we use are conflict-free data types, so there are no race conditions or data consistency issues, because the last update is applied on the cache clusters across the board. If two updates for the same key come in, NCache evaluates that, and it also allows you to build your own conflict resolution if that's a requirement. So, it's managed as part of the NCache topologies.
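The "last update wins" rule can be sketched in a few lines of Python. This is a generic illustration of timestamp-based conflict resolution, not NCache's internal code: an incoming replicated update is applied only if its timestamp is at least as new as what the site already holds, so both sites converge to the same value regardless of arrival order:

```python
class LastWriteWinsStore:
    """Sketch of time-based conflict resolution for active-active
    replication: the update with the newest timestamp wins."""

    def __init__(self):
        self._data = {}                      # key -> (value, timestamp)

    def apply(self, key, value, timestamp):
        current = self._data.get(key)
        if current is None or timestamp >= current[1]:
            self._data[key] = (value, timestamp)   # newer update wins
            return True
        return False                         # stale update from the other site, dropped

    def get(self, key):
        entry = self._data.get(key)
        return entry[0] if entry else None

# Usage: two sites update the same key; the later write wins even if
# it arrives out of order.
store = LastWriteWinsStore()
store.apply("price", 10, timestamp=100)      # from site 1
store.apply("price", 12, timestamp=110)      # from site 2, newer
store.apply("price", 11, timestamp=105)      # late arrival, discarded
```

The key property is that `apply` is deterministic given the timestamps, so every site that sees the same set of updates ends up with the same final value.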
So, here's a quick peek into our bridge configurations.
We have the NCache bridge config here. NCacheBridge is the name, and then we have LondonCache on environment 1 (so you can have multiple caches with the same name as well) and NewYorkCache, and these are connected.
So, let me actually show you all of this in action: how to configure a bridge, how to get started with it, and then we'll actually show you object caching and session caching applications. Before you get into that, Ron, I had a question come up just on the previous slide with the code, and the question is: what are the code changes involved in setting up the bridge? Do they need to write any code in order to have the data replicated through the bridge? Not at all. We don't need any code; it's just configuration. So, you have cache 1 in datacenter 1 and cache 2 in datacenter 2. You simply configure the bridge, and whatever data is already being added by your applications into NCache is going to be replicated through the bridge automatically. It's the bridge's responsibility to take charge of all the replication; you don't need to write any code explicitly to have data replicated across datacenters. And regarding the data types and conflict resolution, that is also implemented by default, which is time-based, but if you want to implement your own conflict resolution, if your business requirement is that you evaluate objects when multiple updates come in, you can implement that interface. But as far as replication of data is concerned, it's the bridge's responsibility; you don't have to write any code for that.
So, let me quickly get started. I'm going to create a cache.
Let's say I'm going to name it site1cache, or let me actually use this right here: SiteOneCache. This is just to give you an idea of how to quickly get started and be able to create the bridge. I'll keep everything default, because NCache's architecture covers all these details.
So, I'm going to quickly go through them: a Partition-Replica cache as the clustering topology, with async replication. I'm going to choose 101, and let's see if I can pick 102 if that's available. These are my two servers to host the cache. I'll keep all of this default, start this, and auto-start as well. Finish. So, my SiteOneCache on 101 and 102 is going to be created. Let's see how it goes, and then I'll go ahead and create another cache on a separate set of servers, then host the bridge and show you how this would all work out. Right, so we have SiteOneCache fully configured, and as you can see, it has started as well.
Now I'm going to go ahead and create another cache, SiteTwoCache. I think I can use that name; I was playing around with it earlier on. Keep everything simple, and this time I'm going to pick a separate set of servers, so that we represent this as a separate site altogether. Keep everything default, and by the way, our bridge allows remote management of all sites: the management and monitoring tools let you manage all the sites, along with the bridge, from one central location. So, if you have network access, if there's a WAN link available between your site one and site two, you can essentially manage everything. So, we have SiteTwoCache right here and SiteOneCache right here: 101 and 102 representing SiteOneCache, 107 and 108 representing SiteTwoCache, and these are started as well.
If I click on statistics, you can see there are no objects added here as yet. Data has not been added to SiteOneCache or SiteTwoCache, so we're good. I would simply run this. I think I have a permissions issue viewing this counter. I think I can... okay. So, you can see there are no items available as yet. I'm now going to link these two caches with the help of a bridge, which I will configure next.
So, here we're going to create a bridge.