NCache: Performance and Scalability Benchmarks

This document describes the performance results of NCache in a distributed environment. Please note that these tests were conducted in a QA environment and not in a professional performance testing lab. This means that you're likely to get better performance than this in your production environment.

Testing Configuration

Operating System: Windows 2008 Server
Architecture: Intel Pentium 4 or later
Number of processors: 2 dual-core processors
Memory: 8GB in each server
NCache version: 3.8 SP1
Network speed: Two 1-Gbit NICs
Client Configuration: Remote clients accessing cache cluster in a LAN. Number of clients increased until max load achieved for the given cache cluster size.
Data size: A string key + 1KB of data
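
The client ramp-up described above (add clients until the cluster reaches max load) can be sketched roughly as follows. This is only an illustration of the idea, not the harness used for these benchmarks: `measure_throughput`, `min_gain`, and `fake_cluster` are hypothetical names, and the numbers produced by `fake_cluster` are made up for the example and are not benchmark results.

```python
def ramp_until_saturated(measure_throughput, max_clients=64, min_gain=0.02):
    """Add clients one at a time until throughput stops improving.

    measure_throughput(clients) is a hypothetical callable that runs the
    workload with that many clients and returns operations per second.
    """
    best_clients, best_ops = 1, measure_throughput(1)
    for clients in range(2, max_clients + 1):
        ops = measure_throughput(clients)
        if ops < best_ops * (1 + min_gain):
            break  # less than min_gain improvement: treat this as max load
        best_clients, best_ops = clients, ops
    return best_clients, best_ops


# Stand-in for a real measurement: throughput rises linearly, then flattens.
# These values are invented for the example.
def fake_cluster(clients):
    return min(clients * 1000, 5000)


clients_at_peak, peak_ops = ramp_until_saturated(fake_cluster)
```

With a real cluster, `measure_throughput` would run the actual read or write workload for a fixed interval and report the observed rate.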

Mirrored Cache Performance

Mirrored Cache is a 2-node active/passive cache cluster. All clients connect to the active node and do their read-write operations against it. All updates are then applied asynchronously to the passive mirrored node.
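
To make the active/passive idea concrete, here is a minimal toy model in Python. It is a sketch under simplifying assumptions (two in-memory dictionaries and an explicit queue), not NCache's implementation or API; the class and method names are invented for illustration.

```python
from collections import deque


class MirroredCache:
    """Toy model of an active/passive pair with asynchronous mirroring."""

    def __init__(self):
        self.active = {}        # node that serves all client reads and writes
        self.passive = {}       # mirror that only receives replicated updates
        self.pending = deque()  # updates queued for the mirror

    def put(self, key, value):
        self.active[key] = value
        # The client returns here; it does not wait for the mirror.
        self.pending.append((key, value))

    def get(self, key):
        return self.active.get(key)

    def drain_mirror_queue(self):
        """Apply queued updates to the passive node in the order they were made."""
        while self.pending:
            key, value = self.pending.popleft()
            self.passive[key] = value


cache = MirroredCache()
cache.put("customer:1", "Alice")
lag_before = "customer:1" not in cache.passive  # mirror has not caught up yet
cache.drain_mirror_queue()
mirrored_after = cache.passive["customer:1"] == "Alice"
```

Because the client never waits on the passive node, write latency is set by the active node alone, at the cost of a short window in which the mirror lags behind.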

Mirrored Cache Benchmarks

Please note that Mirrored Cache can only have two cache servers. Therefore, although it provides excellent performance for both reference and transactional data usage, it does not scale beyond a two-server cluster. You can use other topologies if you want to scale.

Replicated Cache Performance

Replicated cache is good for reference data use. This is because it scales out really nicely for read-performance, but the write-performance actually drops as you grow the cache cluster (as you'll see in the benchmark results).

Replicated cache copies the entire cache to every cache server in the cluster. This means that the storage capacity also does not grow when you add more cache servers to the cluster.
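
The effect of copying the entire cache to every server can be illustrated with a small toy model. This is a sketch with invented names, not NCache code, and it ignores the synchronization between servers that a real replicated cache needs.

```python
class ReplicatedCache:
    """Toy model: every server keeps a full copy of the cache."""

    def __init__(self, server_count):
        self.servers = [{} for _ in range(server_count)]

    def put(self, key, value):
        # One logical write becomes one physical write per server.
        for store in self.servers:
            store[key] = value
        return len(self.servers)  # number of physical writes performed

    def get(self, key, server_index):
        # Any single server can answer a read from its own copy.
        return self.servers[server_index][key]

    def distinct_items(self):
        # Every server holds the same items, so one server's count is the total.
        return len(self.servers[0])


two_nodes = ReplicatedCache(2)
four_nodes = ReplicatedCache(4)
writes_on_two = two_nodes.put("product:42", "widget")
writes_on_four = four_nodes.put("product:42", "widget")
```

In this model, doubling the servers doubles the work per write while the number of distinct items the cluster can hold stays the same. Reads, on the other hand, can be spread across all the copies.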

Replicated Cache Benchmarks

As you can see, the read-performance keeps growing by 26,000 reads per second with each cache server you add. But write-performance drops quite rapidly as you grow the cache cluster. That is why we do not recommend more than 2 cache servers in a cluster for Replicated Cache: growing the cluster further gains you nothing for writes or for storage capacity. There are other topologies (Partitioned and Partition-Replica Cache) that are much better for scaling when it comes to write-performance.

Partitioned Cache Performance

Partitioned cache is an extremely scalable cache for both read-performance and write-performance. Not only does it scale the transaction capacity well (reads and writes), it also grows the storage capacity as you add more servers to your cache cluster.

The only drawback of this topology is that data is not replicated. So, if any server goes down, you lose one partition of the cache and hence a chunk of the data. This may not be an issue if you're using the cache for application data caching where there is a master database, because you can always fetch that data from the database and rebuild the cache. There will, however, be a performance drop while you're doing this.
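
The following toy model sketches both points: each key lives in exactly one partition, and a lost partition can be refilled from the master database on demand. The modulo-of-a-hash placement, the `read_through` helper, and the dictionary standing in for the database are assumptions made for this example, not a description of how NCache distributes or reloads data.

```python
import zlib


class PartitionedCache:
    """Toy model: each key is stored on exactly one server, with no copies."""

    def __init__(self, server_count):
        self.partitions = [{} for _ in range(server_count)]

    def partition_for(self, key):
        # Stable hash so the same key always maps to the same partition.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def put(self, key, value):
        self.partitions[self.partition_for(key)][key] = value

    def get(self, key):
        return self.partitions[self.partition_for(key)].get(key)

    def lose_server(self, index):
        # No replica exists, so the partition's data is simply gone.
        self.partitions[index].clear()


def read_through(cache, database, key):
    """On a cache miss, fetch from the master database and re-cache the item."""
    value = cache.get(key)
    if value is None:
        value = database[key]  # the slower path during the rebuild
        cache.put(key, value)
    return value


database = {"order:7": "3 x widget"}  # stand-in for the master database
cache = PartitionedCache(3)
cache.put("order:7", database["order:7"])

cache.lose_server(cache.partition_for("order:7"))
missing_after_failure = cache.get("order:7") is None
rebuilt_value = read_through(cache, database, "order:7")
```

Every miss after the failure pays for a database round trip, which is the temporary performance drop mentioned above.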

Partitioned Cache Benchmarks

As you can see, the write-performance is actually slightly better than read-performance but both reads and writes are scaling out in a fairly linear fashion.

Partitioned-Replica Cache Performance

Partitioned-Replica cache is a hybrid between replicated and partitioned caches. The cache still has partitioning but each partition also has a replica on a different cache server. This means that if any cache server goes down and its partition is lost, the replica of that partition becomes active and starts serving the clients without any interruptions.

As a result, you get extreme scalability due to partitioning but also reliability through replication of the partitions.
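
A rough sketch of how a replica on a different server keeps a partition available is shown below. The placement rule (the replica of partition `i` lives on server `i + 1`, wrapping around) and all names are assumptions for illustration only. Replication is written inline for brevity, and re-creating replicas after a failure is left out.

```python
import zlib


class PartitionReplicaCluster:
    """Toy model: every partition has one replica on a different server."""

    def __init__(self, server_count):
        if server_count < 2:
            raise ValueError("a replica needs a second server")
        self.server_count = server_count
        self.active = [{} for _ in range(server_count)]   # partition i on server i
        self.replica = [{} for _ in range(server_count)]  # copy held elsewhere

    def replica_host(self, partition):
        # Assumed placement: the next server in the ring holds the replica.
        return (partition + 1) % self.server_count

    def partition_for(self, key):
        return zlib.crc32(key.encode("utf-8")) % self.server_count

    def put(self, key, value):
        partition = self.partition_for(key)
        self.active[partition][key] = value
        self.replica[partition][key] = value

    def get(self, key):
        return self.active[self.partition_for(key)].get(key)

    def fail_server(self, server):
        # The active copy of partition `server` is lost with the server;
        # its replica, held on replica_host(server), is promoted to active.
        self.active[server] = self.replica[server]


cluster = PartitionReplicaCluster(3)
cluster.put("session:abc", "cart=2")
owner = cluster.partition_for("session:abc")
cluster.fail_server(owner)
still_available = cluster.get("session:abc") == "cart=2"
```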

Partitioned Replica Benchmarks

As you can see, the read-performance is faster than write-performance. This is because every write also replicates data to the replica. But this replication is much faster than in Replicated Cache because it is asynchronous and because the replica is passive, so no sequence-based synchronization is needed as it is in Replicated Cache.

The write-performance also scales in a linear fashion and gives you extremely fast performance.

Partition-Replica (Sync-Replication) Performance

Partition-Replica Cache also provides a sync-replication option in which all updates are made synchronously to both the active partition and its replica (meaning the client application waits until both are updated successfully). This ensures that no data is lost if the partition goes down abruptly, which can happen with async-replication.
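
The difference between the two replication modes can be sketched with a single partition and its replica. This is a simplified model with invented names, not NCache's implementation: in sync mode the replica is updated before `put` returns, while in async mode the update waits in a queue that is lost if the active node dies first.

```python
class ReplicatedPartition:
    """Toy model of one partition plus its replica, in sync or async mode."""

    def __init__(self, sync):
        self.sync = sync
        self.active = {}
        self.replica = {}
        self.queue = []  # updates not yet sent to the replica (async mode only)

    def put(self, key, value):
        self.active[key] = value
        if self.sync:
            # The caller gets control back only after the replica is updated.
            self.replica[key] = value
        else:
            # The caller returns immediately; the replica is updated later.
            self.queue.append((key, value))

    def flush(self):
        """Deliver queued updates to the replica (the background step in async mode)."""
        for key, value in self.queue:
            self.replica[key] = value
        self.queue.clear()

    def crash_active(self):
        """The active node dies abruptly: its data and its unsent queue are gone."""
        self.queue.clear()
        self.active = self.replica  # the replica takes over


async_partition = ReplicatedPartition(sync=False)
async_partition.put("k", "v")
async_partition.crash_active()  # crash happens before flush() ran
async_survived = "k" in async_partition.active

sync_partition = ReplicatedPartition(sync=True)
sync_partition.put("k", "v")
sync_partition.crash_active()
sync_survived = "k" in sync_partition.active
```

In the async case the update is lost only because the crash happens before `flush()` runs; had the queue drained first, the replica would hold the item.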

However, because of this sync-replication, update performance is not as fast as with async-replication. It is still much faster than Replicated Cache and, more importantly, you can scale it in a linear fashion by adding more cache servers to the cluster. Below are the benchmarks:

Partitioned-Replica Sync Benchmarks

As you can see, the read-performance is as fast as with async-replication but the write-performance is much slower. It still scales in a linear fashion, though. Also, please note that 1-node benchmarks are not included here because no replication is done with a single node.