NCache Architecture
NCache is an Open Source in-memory distributed cache for .NET, Java, Python, and Node.js. Built for high-transaction workloads, it delivers extreme speed and linear scalability to remove performance bottlenecks and support extreme transaction processing (XTP). NCache also provides intelligent data replication and self-healing dynamic clustering for high availability.
- Dynamic Cluster (High Availability)
- Caching Topologies (Linear Scalability)
Dynamic Cluster (High Availability)
NCache has self-healing dynamic cache clustering based on a peer-to-peer architecture to provide 100% uptime. These are TCP-based clusters; every server in the cluster is a peer. This allows you to add or remove any cache server at runtime from the cluster without stopping either the cache or your application.
Peer-to-Peer Cluster (Self-Healing)
The NCache cluster has a peer-to-peer architecture: there are no master/slave nodes, and every server is a peer. The oldest node in the cluster acts as the Cluster Coordinator; if it goes down, the next oldest node automatically becomes the coordinator.
The Cluster Coordinator manages all cluster operations and cache configuration information. It also monitors cluster health and forcibly removes any cache server that is only partially connected to the other servers in the cluster.
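To make the coordinator rule concrete, here is a minimal sketch of selecting the oldest member as coordinator, so that when it leaves, the next oldest automatically takes over. The types are illustrative assumptions, not part of the NCache API:

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class CoordinatorSketch {
    // A cluster member and the time it joined; illustrative, not NCache's model.
    record Node(String address, Instant joinedAt) {}

    // The Cluster Coordinator is simply the member that joined earliest;
    // when it leaves the membership list, the next oldest is returned.
    static Optional<Node> coordinator(List<Node> liveMembers) {
        return liveMembers.stream().min(Comparator.comparing(Node::joinedAt));
    }
}
```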
Dynamic Clustering
As discussed, NCache has a dynamic clustering architecture. This allows NCache to remain up and running even while cluster membership changes are being made at runtime.
Dynamic clustering allows you to do the following:
- Add/Remove Cache Servers at runtime without stopping the cache or your application.
- Cluster membership is updated at runtime and propagated to all servers in the cluster and all clients connected to the cluster.
Split Brain Cluster Handling
Another part of NCache Dynamic Clustering is its intelligent Split Brain Detection and Recovery capability. Split Brain occurs when, due to networking issues, the connection breaks between cache servers. This causes multiple independent sub-clusters to form, with each one assuming that the others have gone down and that it is the only remaining cluster.
In such cases, when cache servers leave the cluster abruptly, the remaining NCache servers continue attempting to reconnect. But while this is happening, each sub-cluster updates the data independently, creating multiple unsynchronized versions of the data.
After the network issue is resolved, NCache automatically triggers Split Brain Recovery. It decides which cache servers get to keep their data (the winners) and which servers must give up their data (the losers); the servers in the losing sub-cluster then rejoin the winning cluster as new nodes. There is some data loss, but the end result is a quick restoration of the cache cluster without any human intervention.
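The following is a purely conceptual sketch of split-brain recovery. The winner-selection policy shown (larger sub-cluster wins, ties broken by the older coordinator) is an illustrative assumption; NCache's actual decision logic is internal to the product:

```java
import java.util.Comparator;
import java.util.List;

public class SplitBrainRecoverySketch {
    // A sub-cluster formed during a split; illustrative, not NCache's model.
    record SubCluster(List<String> members, long coordinatorAgeMillis) {}

    // ASSUMED policy for illustration only: the larger sub-cluster wins,
    // with ties broken in favor of the older coordinator. The losing
    // sub-clusters discard their data and rejoin the winner as new nodes.
    static SubCluster pickWinner(List<SubCluster> subClusters) {
        return subClusters.stream()
                .max(Comparator.comparingInt((SubCluster c) -> c.members().size())
                        .thenComparingLong(SubCluster::coordinatorAgeMillis))
                .orElseThrow();
    }
}
```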
Dynamic Client Connections
NCache also lets you add or remove clients at runtime without stopping the cache or any of its other clients. A new client only needs a connection to a single cache server in the cluster. Once it connects to that server, it receives the cluster membership and caching topology information, after which it decides which other servers to connect to:
- Partitioned / Partition-Replica Cache: The client connects to all cache server partitions (but not the Replicas, because Replicas only talk to their Partitions). This allows the client to target the appropriate partition for read and write operations. And, if a new server is added to the cluster, the client receives updated cluster membership information and connects to the newly added server as well.
- Replicated Cache: In the case of the Replicated Cache, the client connects to just one cache server in the cluster, chosen in a load-balanced manner to ensure that all cache servers have an equal number of clients. All reads and writes are performed on that one cache server. When a new server is added to the cluster, the client obtains this information from the cache server and reconnects to a new server, if necessary.
- Mirrored Cache: In the case of the Mirrored Cache, the client connects to the only active node in this 2-node cluster. If the client connects to the passive node, the passive node tells the client about the active node, and the client automatically reconnects to the active node. If the active node goes down and the passive node becomes active, all clients automatically connect to the new active node.
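Conceptually, the client bootstrap described above might look like the following sketch. The types and topology names are illustrative assumptions, not the NCache client API:

```java
import java.util.List;

public class ClientBootstrapSketch {
    // Cluster info the client receives from its first server connection.
    record ClusterInfo(String topology, List<String> servers, String activeNode) {}

    // Decide which servers to connect to, based on the caching topology.
    static List<String> serversToConnect(ClusterInfo info) {
        return switch (info.topology()) {
            // Partitioned / Partition-Replica: connect to every partition server.
            case "partitioned", "partition-replica" -> info.servers();
            // Replicated: stay on the single (load-balanced) server assigned.
            case "replicated" -> List.of(info.servers().get(0));
            // Mirrored: connect to the active node only.
            case "mirrored" -> List.of(info.activeNode());
            default -> throw new IllegalArgumentException(info.topology());
        };
    }
}
```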
Dynamic Configuration
As mentioned in the Dynamic Client Connections and Dynamic Clustering sections, NCache provides dynamic configuration of the cache and clients. This section explains how this runtime configuration works.
- Cache Configuration: When a cache is created through admin tools, its configuration information is copied to all the cache servers known at that time. Similarly, any new server added to the cluster at runtime receives the entire updated cache configuration and copies it to its local disk.
- Hot Apply Config Changes: You can change some cache configurations at runtime through "Hot Apply", for example, changing the cache size or enabling compression. In this case, the updated configuration information is propagated to all the cache servers at runtime and saved on their disks. Part of this information is also sent to all the clients, as necessary.
- Distribution Map (Partitioned/Partition-Replica Cache): This is created when a cache starts and is then copied to all the cache servers and clients. The Distribution Map contains information about which buckets (out of a total of 1000 buckets in the clustered cache) are located in which partition.
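As a rough illustration of how a Distribution Map can be used, the sketch below maps a key to one of the 1000 buckets and then to the server whose partition owns that bucket. The hashing scheme and type names are assumptions for illustration, not NCache internals:

```java
import java.util.Map;

public class DistributionMapSketch {
    private static final int TOTAL_BUCKETS = 1000; // per the docs above

    private final Map<Integer, String> bucketToServer; // bucketId -> server address

    public DistributionMapSketch(Map<Integer, String> bucketToServer) {
        this.bucketToServer = bucketToServer;
    }

    // Map a cache key to a bucket, then to the server currently owning it.
    public String serverFor(String key) {
        int bucket = Math.abs(key.hashCode() % TOTAL_BUCKETS);
        return bucketToServer.get(bucket);
    }
}
```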
Connection Failover within Cluster
All cache servers in the cluster are connected through TCP, and every cache server is connected to all other cache servers, including any new servers added at runtime. NCache provides various ways to keep connections within the cluster alive despite connection failures. These failures usually occur because of a network glitch in routers or firewalls, or an issue with the network card or network driver.
- Connection Retries: If the connection between two cache servers breaks, NCache automatically attempts multiple retries to re-establish it. These retries occur for the duration of the user-specified timeout period.
- Keep-Alive Heartbeat: Each cache server also periodically sends small data packets as a heartbeat to all other servers. This ensures that if there is a network socket issue, the cache servers detect it and fix it through retries.
- Partially Connected Servers: In some cases, network issues can split the cluster into sub-clusters (a "Split Brain"). NCache automatically detects and resolves this once the network is restored by deciding which cluster keeps its data and requiring the others to rejoin.
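Here is a minimal sketch of the retry behavior described above, assuming illustrative interval and timeout values (NCache's actual retry logic is internal and configurable):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class ConnectionRetrySketch {
    // Keep retrying the connection until the timeout window expires.
    public static Socket reconnect(String host, int port, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            try {
                Socket socket = new Socket();
                socket.connect(new InetSocketAddress(host, port), 2_000);
                return socket; // connection re-established
            } catch (IOException retryLater) {
                Thread.sleep(1_000); // wait before the next retry (assumed interval)
            }
        }
        return null; // caller treats the peer as disconnected
    }
}
```

A keep-alive heartbeat can be layered on top of the same connection by periodically sending a small packet and treating a send failure as a trigger for these retries.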
Connection Failover with Clients
Connection failover for clients works much like failover within the cluster: retries, heartbeats, and auto-reconnect to healthy servers.
- Connection Retries: Connection retries work the same at the client level as within the cluster: if the connection between a client and a cache server breaks, the NCache client automatically makes multiple retries to re-establish it. These retries occur for the duration of the timeout period. If the connection cannot be established, an exception is thrown to the client application so it can handle the failure.
- Keep-Alive Heartbeat: Same as in Connection Failover within Cluster.
- Partially Connected Clients (Partitioned / Partition-Replica Cache): Sometimes, despite retries, a connection isn't restored in time, so the client assumes a server is unreachable even when it's not. In the case of Partitioned/Partition-Replica Cache, the client then reads and writes all data through the servers it can reach, even though its Distribution Map says the data resides on the server it cannot talk to. The reachable cache server acts as an intermediary and completes the operation on the client's behalf (see the sketch below).
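The sketch below illustrates this fallback routing, reusing the bucket-mapping idea from the Distribution Map section; all names are illustrative assumptions:

```java
import java.util.Map;
import java.util.Set;

public class PartiallyConnectedClientSketch {
    // Pick the server to send an operation to: the owning partition if
    // reachable, otherwise any reachable server acting as an intermediary.
    static String routeFor(String key,
                           Map<Integer, String> distributionMap,
                           Set<String> reachableServers) {
        int bucket = Math.abs(key.hashCode() % 1000);
        String owner = distributionMap.get(bucket);
        if (reachableServers.contains(owner)) {
            return owner; // talk to the owning partition directly
        }
        // Fall back to any reachable server; it forwards to the owner.
        return reachableServers.iterator().next();
    }
}
```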
Caching Topologies (Linear Scalability)
NCache provides a variety of caching topologies to enable linear scalability while maintaining data consistency and reliability. The goal is to support applications from small two-server caches to large clusters with hundreds of servers. A caching topology is essentially a data storage, data replication, and client connection strategy for a clustered cache spanning multiple cache servers.
Reference Data vs. Transactional Data
Reference data doesn't change very frequently, and you cache it to cater to frequent requests and prevent costly database trips, while only occasionally updating it. Transactional data, on the other hand, is data that changes very frequently, and you may update it as frequently as you read it.
In the early days, caches were used mainly for reference data because data that changed frequently would become stale and out of sync with the latest data in the database. However, NCache now provides very powerful features that enable the cache to keep its cached data synchronized with the database.
All of NCache's caching topologies are good for reference data, but a few are particularly beneficial for transactional data. Thus, you need to determine how many reads versus writes you'll be doing to figure out which topology is best for you. Additionally, some caching topologies don't scale as much, especially for updates, so keep that in mind too.
Below is a list of caching topologies along with their impact on reads versus writes.
- Partitioned Cache (No Replication): This is the fastest topology, but it does not replicate data, so there is data loss if a cache server goes down.
- Partition-Replica Cache (Most Popular): This topology is superfast for both reads and writes, and it also replicates data for reliability without compromising speed or scalability. It offers the best combination of speed/scalability and data reliability.
- Replicated Cache: Very good for smaller environments. Each server keeps a full copy of the cache, ensuring high data reliability and fault tolerance. Superfast and linearly scalable for reads. Moderately fast for writes in a 2-node cluster, but writes do not scale as you add more servers because they are performed synchronously on all cache servers.
- Mirrored Cache: Very good for smaller environments. Its 2-node active/passive design gives it faster write operations than the Replicated Cache, but it does not scale beyond two nodes.
- Client Cache: Very good for read-intensive use cases with all caching topologies. Lets you achieve InProc speed with a distributed cache.
Partitioned Cache
Partitioned Cache is the fastest and most scalable caching topology for both reads and writes. Designed for large clusters, it delivers fast reads and writes even under peak loads. However, it does not replicate data, so there is no backup if a server goes down.
Here are some characteristics of Partitioned Cache.
- Dynamic Partitions: The cache is divided into partitions at runtime, with each cache server hosting one partition. There are a total of 1000 buckets per clustered cache, evenly distributed across all partitions. Adding or removing a cache server results in the creation or deletion of partitions at runtime. The bucket-to-partition assignment does not change when data is added to the cache; it only changes when partitions are added or deleted, or when data is load balanced. This rebalancing refers to the state transfer process that moves buckets and their data to the target partitions.
- Distribution Map: The cache cluster creates a Distribution Map that contains information about which buckets exist in which partitions. The Distribution Map is updated whenever there is a state transfer and is propagated to all the servers and clients. The clients use it to figure out which cache server to talk to for any read/write operation.
- Dynamic Data Balancing: All buckets are HashMap-based, and data is distributed by applying a hashing algorithm to the keys. This can cause some buckets to hold more data than others, depending on the keys used. If this imbalance crosses a configurable threshold, NCache automatically shifts buckets around to rebalance the load (see the sketch after this list).
- Clients Connect to ALL Partitions: Clients connect to all the cache servers so they can directly read or write data in one request. If a client's connection to a cache server goes down, the client asks one of the other servers to read or write a cached item that exists on the server it cannot access, and that server completes the operation on the client's behalf.
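Here is a conceptual sketch of the data-balancing trigger mentioned in the list above. The threshold check is illustrative; in NCache the threshold is configurable, and the actual bucket-movement (state transfer) logic is internal:

```java
import java.util.Map;

public class DataBalancingSketch {
    // Return true if any partition holds more data than the cluster average
    // by more than the given threshold percentage, signaling a rebalance.
    static boolean needsRebalancing(Map<String, Long> bytesPerPartition,
                                    double thresholdPercent) {
        long total = bytesPerPartition.values().stream()
                .mapToLong(Long::longValue).sum();
        double average = (double) total / bytesPerPartition.size();
        return bytesPerPartition.values().stream()
                .anyMatch(size -> size > average * (1 + thresholdPercent / 100.0));
    }
}
```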
Partition-Replica Cache
NOTE: Everything mentioned in Partitioned Cache is true here, too.
Just like Partitioned Cache, Partition-Replica Cache is an extremely fast and linearly scalable caching topology for both reads and writes. It is intended for larger cache clusters, and the performance of reads and writes remains very good even under peak loads. On top of this, the Partition-Replica Cache also replicates data. Hence, there is no data loss even if a cache server goes down.
Partition-Replica Cache is our most popular caching topology because it gives you the best of both worlds: performance/linear scalability and data reliability.
Below are some of the characteristics of Partition-Replica Cache.
- Dynamic Partitions: Same as Partitioned Cache.
- Dynamic Replicas: When partitions are created or deleted at runtime, their Replicas are also created or deleted. A Replica is always on a different cache server than its Partition, and there is only one Replica per Partition.
- Async Replication: By default, replication from a Partition to its Replica is asynchronous. Client writes (add/update/delete) complete on the Partition and are queued for asynchronous, bulk replication to the Replica (see the sketch after this list). This improves performance, at the slight risk of losing data if a Partition goes down before all updates have been replicated to its Replica. But this is incredibly rare.
- Sync Replication: If your data is very sensitive (e.g., financial data) and you cannot afford stale data, you can choose the Sync Replication option in the configuration. When selected, every write operation is performed synchronously on both the Partition and the Replica before it is considered complete. If the operation fails on the Replica, it also fails on the Partition, which guarantees that the data in the cache (in both Partition and Replica) is always consistent. However, this has a performance implication, as it is slower than Async Replication.
- Distribution Map: Same as Partitioned Cache.
- Dynamic Data Balancing (Partitions & Replicas): Same as Partitioned Cache. However, in Partition-Replica Cache, data balancing also occurs in the Replicas when the Partitions are data balanced.
- Clients Connect to ALL Partitions: Same as Partitioned Cache. However, in Partition-Replica Cache, the clients only talk to Partitions and not their Replicas. Replicas are passive; only the Partitions talk to their Replicas, when replicating data to them.
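The following sketch illustrates the async replication flow described in this list: writes complete on the Partition, get queued, and are shipped to the Replica in bulk. All types are illustrative assumptions, not NCache internals:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class AsyncReplicationSketch {
    record WriteOp(String key, Object value) {}

    private final BlockingQueue<WriteOp> replicationQueue = new LinkedBlockingQueue<>();

    // Called on the Partition's write path; returns as soon as the op is queued,
    // so the client write completes without waiting for the Replica.
    void onWrite(WriteOp op) {
        replicationQueue.add(op);
    }

    // Background loop: drain queued writes and ship them to the Replica in bulk.
    void replicateLoop() throws InterruptedException {
        while (true) {
            List<WriteOp> batch = new ArrayList<>();
            batch.add(replicationQueue.take());  // block until at least one op
            replicationQueue.drainTo(batch, 99); // then batch whatever is pending
            sendToReplica(batch);
        }
    }

    private void sendToReplica(List<WriteOp> batch) {
        // network call to the Replica node (omitted in this sketch)
    }
}
```

Sync Replication would instead perform `sendToReplica` inline on the write path and fail the write if the Replica call fails.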
Replicated Cache
The Replicated Cache provides data reliability through replication on two or more cache servers. It is very fast and scalable for reads, but it does not scale for writes because they are performed synchronously on all the servers in the cluster. For a 2-node cluster, writes are faster than your database but not as fast as in a Partition-Replica Cache. For clusters of 3 or more servers, write performance degrades and eventually becomes costly.
Below are some of the characteristics of Replicated Cache.
- Dynamic Replicated Nodes: You can add or remove cache servers at runtime without stopping the cache or your application. A newly added server copies the entire cache (a Replica) onto itself. When a server is removed, cluster membership is updated and all of its clients move to other servers.
- Entire Cache on Each Node: The entire cache is copied to every server in the cluster.
- Reads are Scalable: Reads are superfast and scale as you add more servers. However, adding more servers does not increase the cache size, as each newly added server holds just another copy of the entire cache.
- Writes are Synchronous: Writes are very fast for a 2-node cluster and faster than your database. But writes are synchronous, meaning a write operation does not complete until all cache servers have been updated. So, writes are not as fast as in other topologies.
- Client Connects to One Server Only: Each cache client connects to only one server in the cluster, chosen by a load-balancing algorithm on the cache servers (see the sketch after this list). If this cache server goes down, the client connects to the next server in its list. You can also manually specify the server to connect to in the cache configuration file if you don't want to use load balancing.
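A minimal sketch of the load-balanced client assignment described in the last item above, assuming the cluster tracks a client count per server (types are illustrative, not NCache's algorithm):

```java
import java.util.Comparator;
import java.util.List;

public class ClientLoadBalanceSketch {
    record Server(String address, int connectedClients) {}

    // Assign a new client to the server with the fewest connected clients,
    // so clients spread evenly across the cluster.
    static String assignServer(List<Server> servers) {
        return servers.stream()
                .min(Comparator.comparingInt(Server::connectedClients))
                .map(Server::address)
                .orElseThrow();
    }
}
```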
Mirrored Cache
The Mirrored Cache is a 2-node active/passive cache cluster intended for smaller environments. It provides data reliability through asynchronous replication/mirroring from the active node to the passive node. It is very fast for both reads and writes (in fact, its write operations are faster than Replicated Cache), but does not scale beyond this 2-Node active/passive cluster.
Below are some of the characteristics of Mirrored Cache.
- 1 Active & 1 Passive Server: The Mirrored Cache has only two servers, one active and one passive, and both have a copy of the entire cache. If the active server goes down, the passive server automatically becomes active. If the previously active server comes back up, it joins as a passive server unless you change this designation through admin tools at runtime.
- Clients' Connections with Failover Support: Each cache client connects only to the active server to do its read and write operations. If the active server goes down, all clients automatically connect to the passive server, which has become active by then (see the sketch after this list). This failover support ensures that the Mirrored Cache is always up and running, even if a server goes down.
- Async Mirroring: Any writes done on the active server are asynchronously mirrored/replicated to the passive server. This keeps the passive server synchronized with the latest data in case the active server goes down and the passive server has to become active. Async mirroring also means faster performance, because multiple writes are performed as a BULK operation on the passive server.
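Here is a conceptual sketch of the 2-node client failover described above; the types are illustrative stand-ins, not the NCache client API:

```java
public class MirroredFailoverSketch {
    private final String[] nodes; // exactly two nodes in a Mirrored Cache
    private int activeIndex = 0;  // index of the node we treat as active

    MirroredFailoverSketch(String nodeA, String nodeB) {
        this.nodes = new String[] { nodeA, nodeB };
    }

    String currentActive() {
        return nodes[activeIndex];
    }

    // Called when the connection to the active node is lost: the other
    // node has become active, so the client reconnects to it.
    String failover() {
        activeIndex = 1 - activeIndex;
        return nodes[activeIndex];
    }
}
```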
Client Cache (InProc Speed)
The Client Cache is local to your web/app server and sits very close to your application, letting you cache data that you read from the distributed cache (regardless of caching topology). While local to your application, a Client Cache is not stand-alone. Instead, it is always synchronized with the clustered cache. This ensures that data in the Client Cache is never stale.
You can see this as a "cache on top of a cache", and it improves your application performance and scalability even further. If you use Client Cache in InProc mode, you can achieve InProc speed. NCache provides two variations of Client Cache, the Regular and Full-Data Client Cache. Each is designed to boost performance by reducing network calls while staying synchronized with the clustered cache.
Regular Client Cache
The Regular Client Cache acts as a local (L1) cache on the client machine, keeping a subset of frequently accessed data close to the application. This reduces repeated network trips to the clustered cache (L2). To maintain consistency, the Client Cache stays synchronized with the clustered cache by receiving change notifications. Since it does not hold the complete dataset, all queries are still executed against the clustered cache (L2).
Full-Data Client Cache
The Full Data Client Cache goes a step further by caching entire datasets locally on the client machine. It offers the following benefits and options:
- Caches Complete Datasets Locally: Caches all entries of selected .NET classes, making the complete datasets instantly available.
- Near In-Proc Speed: Full-Data Client Cache provides faster read operations and allows SQL queries to run entirely on the Client Cache, since the complete dataset is present locally.
- Synchronization with Clustered Cache: It remains fully synchronized with the clustered cache to ensure consistency.
- Strict Query Enforcement: This enforcement ensures that queries run only when the complete dataset is locally available. If the dataset is partially loaded, the query fails immediately without falling back to the clustered cache.
- Strict Local Reads: This forces all read operations to be served strictly from the Client Cache. If a key is not found locally, it returns a cache miss instead of fetching it from the clustered cache.
Below are some of the characteristics of the Client Cache.
- Good for Read-Intensive Cases: Client Cache is ideal for read-intensive use cases. However, if the number of writes equals the number of reads, the Client Cache is actually slower, because every write operation involves updating the data in two places.
- Faster Speed like a Local Cache (InProc / OutProc): A Client Cache either exists inside your application process (InProc mode) or runs local to your web/app server (OutProc mode). In either case, it boosts your application's performance significantly compared to fetching the data from the clustered cache. InProc mode lets you cache objects on your "application heap", which gives you "InProc Speed" that cannot be matched by any distributed cache.
- Not a Stand-Alone Cache: A Client Cache may be a local cache, but it is not a stand-alone cache. It is synchronized with the clustered cache. This means that if another client updates data in the clustered cache that you have in your Client Cache, the clustered cache notifies the Client Cache to update itself with the latest copy of that data. This is done asynchronously but immediately.
- Optimistic / Pessimistic Synchronization: By default, Client Cache uses Optimistic Synchronization, meaning the NCache client assumes that whatever data the Client Cache has is the latest copy. If the Client Cache doesn't have an item, the client fetches it from the clustered cache, puts it in the Client Cache, and returns it to the client application. Pessimistic Synchronization means the cache client first checks the clustered cache for a newer version of a cached item. If one exists, the client fetches it, puts it in the Client Cache, and returns it to the client application; otherwise, it returns whatever is in the Client Cache (see the sketch after this list).
- Plug-in without Any Code Change: Using the Client Cache involves no application code change; only a simple configuration change is required.
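To make the two synchronization modes concrete, here is a sketch of the two read paths. ClientCache and ClusteredCache are illustrative stand-ins (not the NCache API), and version checking is simplified to a single call:

```java
import java.util.Optional;

public class ClientCacheSyncSketch {
    interface ClientCache {
        Optional<Object> get(String key);
        void put(String key, Object value);
        long versionOf(String key);
    }

    interface ClusteredCache {
        Object get(String key);
        boolean hasNewerVersion(String key, long localVersion);
    }

    // Optimistic: trust the local copy if present; fetch only on a miss.
    static Object optimisticGet(String key, ClientCache local, ClusteredCache cluster) {
        return local.get(key).orElseGet(() -> {
            Object value = cluster.get(key);
            local.put(key, value);
            return value;
        });
    }

    // Pessimistic: always ask the clustered cache first whether a newer
    // version exists; only then serve the local copy.
    static Object pessimisticGet(String key, ClientCache local, ClusteredCache cluster) {
        Optional<Object> cached = local.get(key);
        if (cached.isEmpty() || cluster.hasNewerVersion(key, local.versionOf(key))) {
            Object value = cluster.get(key);
            local.put(key, value);
            return value;
        }
        return cached.get();
    }
}
```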
What to Do Next?