NCache is a distributed caching solution designed to improve the scalability and performance of .NET applications. NCache reduces database load and improves response times by offloading data from databases to in-memory caches. This whitepaper presents performance benchmarks for NCache's different caching topologies: Partitioned Cache, Partition-Replica Cache, Mirrored Cache, and Replicated Cache. It is tailored for developers, architects, IT decision-makers, and performance engineers to evaluate how NCache can enhance their applications.
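To make the offloading pattern concrete, the sketch below shows a typical cache-aside read against an NCache cache from a .NET application. It is a minimal sketch, assuming the NCache .NET client API (Alachisoft.NCache.Client with CacheManager.GetCache, ICache.Get&lt;T&gt;, and ICache.Insert); the cache name, Product type, and database helper are hypothetical placeholders and are not part of the benchmarks in this paper.

```csharp
using System;
using Alachisoft.NCache.Client;   // NCache .NET client SDK (assumed package reference)

// Hypothetical cached entity; cached objects are typically required to be serializable.
[Serializable]
public class Product
{
    public string Id { get; set; }
    public string Name { get; set; }
}

public class ProductService
{
    // The cache name is a placeholder and must match a cache configured on the NCache servers.
    private readonly ICache _cache = CacheManager.GetCache("demoCache");

    public Product GetProduct(string productId)
    {
        string key = $"product:{productId}";

        // 1. Try the in-memory cache first (cache-aside read).
        Product product = _cache.Get<Product>(key);
        if (product != null)
            return product;

        // 2. On a miss, load from the database and populate the cache
        //    so subsequent reads are served from memory instead of the database.
        product = LoadProductFromDatabase(productId);   // hypothetical data-access helper
        if (product != null)
            _cache.Insert(key, product);

        return product;
    }

    private Product LoadProductFromDatabase(string productId)
    {
        // Placeholder for the application's actual database call.
        throw new NotImplementedException();
    }
}
```

On a cache hit the database is never touched, which is where the reduced database load and faster response times come from.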
To evaluate NCache's performance, benchmarks were conducted in a cloud environment on Microsoft Azure. The server, client, and VM configurations used for each topology are listed in the tables below. All tests were conducted with pipelining enabled.
In the Partitioned Cache topology, the data is divided among multiple servers, with each server owning a portion of the total cache. For the benchmarks, the test environment was varied in terms of the number of cache servers, VM sizes, and client instances.
Servers | VM Size | Req/sec | Server CPU | Client VMs | Client Instances |
---|---|---|---|---|---|
2 | ENT4 (4 GB, 2 vCPUs) | 150,000 | 90% | 4 | 8 |
3 | ENT4 (4 GB, 2 vCPUs) | 225,000 | 90% | 8 | 16 |
4 | ENT4 (4 GB, 2 vCPUs) | 300,000 | 90% | 12 | 24 |
5 | ENT4 (4 GB, 2 vCPUs) | 375,000 | 90% | 16 | 32 |
6 | ENT4 (4 GB, 2 vCPUs) | 450,000 | 90% | 16 | 32 |
2 | ENT8 (8 GB, 4 vCPUs) | 250,000 | 96% | 4 | 8 |
3 | ENT8 (8 GB, 4 vCPUs) | 375,000 | 96% | 8 | 16 |
4 | ENT8 (8 GB, 4 vCPUs) | 500,000 | 96% | 12 | 24 |
5 | ENT8 (8 GB, 4 vCPUs) | 625,000 | 96% | 16 | 32 |
6 | ENT8 (8 GB, 4 vCPUs) | 750,000 | 96% | 16 | 32 |
2 | ENT16 (16 GB, 8 vCPUs) | 550,000 | 96% | 8 | 24 |
3 | ENT16 (16 GB, 8 vCPUs) | 825,000 | 96% | 8 | 24 |
4 | ENT16 (16 GB, 8 vCPUs) | 1,100,000 | 96% | 12 | 36 |
5 | ENT16 (16 GB, 8 vCPUs) | 1,375,000 | 96% | 16 | 48 |
6 | ENT16 (16 GB, 8 vCPUs) | 1,650,000 | 96% | 16 | 48 |
2 | ENT32 (32 GB, 16 vCPUs) | 900,000 | 96% | 8 | 24 |
3 | ENT32 (32 GB, 16 vCPUs) | 1,350,000 | 96% | 8 | 24 |
4 | ENT32 (32 GB, 16 vCPUs) | 1,800,000 | 96% | 12 | 36 |
5 | ENT32 (32 GB, 16 vCPUs) | 2,250,000 | 96% | 16 | 48 |
6 | ENT32 (32 GB, 16 vCPUs) | 2,700,000 | 96% | 16 | 48 |
The Partitioned Cache topology delivers the fastest performance with linear scalability: on ENT16 VMs, for example, throughput grows from 550,000 req/sec on 2 servers to 1,650,000 req/sec on 6 servers, roughly 275,000 req/sec per server. Unlike the Partition-Replica Cache or Replicated Cache, it avoids the cost of data replication, allowing for higher throughput and lower CPU usage. This makes it an excellent choice for handling high loads efficiently while keeping resource consumption minimal, which is ideal for large-scale applications that demand maximum speed.
Partitioned Cache is most effective in high-volume, read-heavy applications where spreading data across multiple servers greatly enhances performance. It is best suited to situations where the application can always reload data from the database if a cache server goes down. If the application cannot recover the data from the database, a replication-based topology such as Partition-Replica Cache or Replicated Cache should be used instead to ensure data availability and consistency.
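The sketch below illustrates the general idea behind partitioning, not NCache's internal algorithm: each key hashes to a bucket, and each bucket is owned by exactly one cache server, so both data and request load are spread evenly across the cluster. The bucket count and server names are hypothetical.

```csharp
using System;

// Conceptual model of hash-based partitioning: every key hashes to a bucket, and each
// bucket is owned by exactly one cache server. This illustrates the general technique,
// not NCache's internal distribution algorithm.
public static class PartitionMap
{
    private const int BucketCount = 1000;                       // hypothetical bucket count
    private static readonly string[] Servers =                  // hypothetical 3-node cluster
        { "cache-server-1", "cache-server-2", "cache-server-3" };

    public static string GetOwner(string key)
    {
        // Stable hash -> bucket -> owning server, so the same key always lands on the same node.
        int bucket = (GetStableHash(key) & 0x7FFFFFFF) % BucketCount;
        return Servers[bucket % Servers.Length];
    }

    private static int GetStableHash(string key)
    {
        // Simple deterministic string hash, used only for illustration.
        unchecked
        {
            int hash = 23;
            foreach (char c in key)
                hash = hash * 31 + c;
            return hash;
        }
    }
}

// Example: PartitionMap.GetOwner("product:42") always resolves to the same server,
// so a client can send the request directly to the node that owns that key.
```

Because every key has exactly one owner, adding a server adds both memory and throughput capacity, which is consistent with the near-linear gains in the table above.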
The Partition-Replica Cache topology uses a combination of partitioned and replicated data. Each partition has a replica on a different server, ensuring failover protection and high availability.
Servers | VM Size | Req/sec | Server CPU | Client VMs | Client Instances |
---|---|---|---|---|---|
2 | ENT4 (4 GB, 2 vCPUs) | 80,000 | 90% | 4 | 8 |
3 | ENT4 (4 GB, 2 vCPUs) | 120,000 | 90% | 8 | 16 |
4 | ENT4 (4 GB, 2 vCPUs) | 160,000 | 90% | 12 | 24 |
5 | ENT4 (4 GB, 2 vCPUs) | 200,000 | 90% | 16 | 32 |
6 | ENT4 (4 GB, 2 vCPUs) | 240,000 | 90% | 16 | 32 |
2 | ENT8 (8 GB, 4 vCPUs) | 160,000 | 96% | 4 | 8 |
3 | ENT8 (8 GB, 4 vCPUs) | 240,000 | 96% | 8 | 16 |
4 | ENT8 (8 GB, 4 vCPUs) | 320,000 | 96% | 12 | 24 |
5 | ENT8 (8 GB, 4 vCPUs) | 400,000 | 96% | 16 | 32 |
6 | ENT8 (8 GB, 4 vCPUs) | 480,000 | 96% | 16 | 32 |
2 | ENT16 (16 GB, 8 vCPUs) | 360,000 | 96% | 8 | 24 |
3 | ENT16 (16 GB, 8 vCPUs) | 540,000 | 96% | 8 | 24 |
4 | ENT16 (16 GB, 8 vCPUs) | 720,000 | 96% | 12 | 36 |
5 | ENT16 (16 GB, 8 vCPUs) | 900,000 | 96% | 16 | 48 |
6 | ENT16 (16 GB, 8 vCPUs) | 1,080,000 | 96% | 16 | 48 |
2 | ENT32 (32 GB, 16 vCPUs) | 550,000 | 96% | 8 | 24 |
3 | ENT32 (32 GB, 16 vCPUs) | 825,000 | 96% | 8 | 24 |
4 | ENT32 (32 GB, 16 vCPUs) | 1,100,000 | 96% | 12 | 36 |
5 | ENT32 (32 GB, 16 vCPUs) | 1,375,000 | 96% | 16 | 48 |
6 | ENT32 (32 GB, 16 vCPUs) | 1,650,000 | 96% | 16 | 48 |
The Partition-Replica Cache provides good performance, especially for applications that require high availability and redundancy. By replicating data across multiple servers, it minimizes the risk of downtime. However, the overhead of maintaining replicas results in lower throughput than the Partitioned Cache, roughly 55-65% of its req/sec at the same server count and VM size in the tables above.
Partition-Replica Cache is well suited for scenarios where high availability and fault tolerance are essential, particularly systems where uptime is critical and the application cannot afford to lose cached data.
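As a rough illustration of how the topology behaves, the sketch below extends the partitioning idea with replicas: each partition's backup is kept on a different server, and every write must also be applied to that backup. The placement rule shown (replica on the next server in the ring) is an assumed convention for illustration, not NCache's exact scheme.

```csharp
using System;
using System.Collections.Generic;

// Conceptual partition-replica placement and write path: the active copy of a partition
// lives on one server and its replica on a different server, so a single node failure
// never loses a partition. Illustrative only, not NCache's exact placement or
// replication protocol.
public static class PartitionReplicaModel
{
    private static readonly string[] Servers =           // hypothetical 3-node cluster
        { "cache-server-1", "cache-server-2", "cache-server-3" };

    // Active copy on server i, replica on the next server in the ring.
    public static (string Active, string Replica) GetPlacement(int partitionId)
    {
        int active = partitionId % Servers.Length;
        int replica = (active + 1) % Servers.Length;      // always a different node
        return (Servers[active], Servers[replica]);
    }

    // Every write is applied to the active partition and then copied to its replica.
    // This second copy per write is the overhead that makes the throughput figures
    // above lower than the Partitioned Cache numbers for the same hardware.
    public static IEnumerable<string> GetWriteTargets(int partitionId)
    {
        var (active, replica) = GetPlacement(partitionId);
        yield return active;
        yield return replica;
    }
}
```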
In the Mirrored Cache topology, data is mirrored across two servers for redundancy. This guarantees availability even if one server fails.
Servers | VM Size | Req/sec | Server CPU | Client VMs | Client Instances |
---|---|---|---|---|---|
2 | ENT4 (4 GB, 2 vCPUs) | 60,000 | 90% | 4 | 8 |
2 | ENT8 (8 GB, 4 vCPUs) | 120,000 | 96% | 6 | 12 |
2 | ENT16 (16 GB, 8 vCPUs) | 240,000 | 96% | 8 | 16 |
2 | ENT32 (32 GB, 16 vCPUs) | 550,000 | 96% | 8 | 24 |
The Mirrored Cache provides a good balance between high availability and performance. It maintains a full copy of the cache on each of its two servers, ensuring data availability even if one goes down. However, the additional overhead of mirroring, combined with the two-server limit, results in lower performance compared to the Partitioned Cache and Partition-Replica Cache.
The Mirrored Cache is best suited for smaller-scale applications where high availability is crucial and performance overhead is manageable.
In the Replicated Cache topology, data is fully replicated across all servers so that every server contains an identical copy of the entire cache.
Servers | VM Size | Req/sec | Server CPU | Client VMs | Client Instances |
---|---|---|---|---|---|
2 | ENT4 (4 GB, 2 vCPUs) | 50,000 | 90% | 4 | 8 |
2 | ENT8 (8 GB, 4 vCPUs) | 72,000 | 96% | 6 | 12 |
2 | ENT16 (16 GB, 8 vCPUs) | 84,000 | 96% | 6 | 12 |
2 | ENT32 (32 GB, 16 vCPUs) | 100,000 | 96% | 8 | 24 |
2 | ENT16 (16 GB, 8 vCPUs), reads only | 820,000 | 96% | 8 | 52 |
2 | ENT32 (32 GB, 16 vCPUs), reads only | 975,000 | 96% | 8 | 64 |
Replicated Cache provides the highest availability since each server maintains a complete copy of the cache. However, the need to synchronize every update across all servers introduces additional overhead, resulting in lower performance than the other topologies for workloads that include writes, while read-only workloads remain very fast because every read is served locally.
Replicated Cache is best suited for applications where absolute data consistency and redundancy are critical. It works well in smaller-scale environments where performance trade-offs are acceptable.
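A simple cost model, sketched below under the assumption that a read touches one node while a write must be applied on every node, makes the read/write asymmetry in the table above easy to see; it is an illustration, not NCache's synchronization protocol.

```csharp
using System;

// Simple cost model for a Replicated Cache with N identical nodes, each holding a full
// copy of the cache. Illustrative only, not NCache's synchronization protocol.
public static class ReplicatedCacheModel
{
    // A read touches exactly one node, so total read capacity grows with the node count.
    public static double ReadCapacity(int nodeCount, double readsPerNodePerSec)
        => nodeCount * readsPerNodePerSec;

    // A write must be applied on every node to keep all copies identical, so adding
    // nodes does not increase write throughput; it adds synchronization work instead.
    public static double WriteWorkPerOperation(int nodeCount, double costPerNodeWrite)
        => nodeCount * costPerNodeWrite;
}
```

This matches the benchmark: the read-only runs reach 820,000-975,000 req/sec on two nodes, while the standard runs on the same VM sizes stay at 84,000-100,000 req/sec.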
A Partitioned Cache delivers better performance compared to a Partition-Replica Cache because it avoids the overhead of data replication. However, if an application cannot afford to lose cached data, a Partition-Replica Cache is the better choice as it provides redundancy through intelligent data replication. This makes it the most commonly used topology because it combines the linear scalability of Partitioned Cache with the reliability of data replication, ensuring high availability and fault tolerance while maintaining performance.
Mirrored Cache offers data replication, which a Partitioned Cache does not provide. However, Partitioned Cache is much faster and can scale out by adding more servers, while Mirrored Cache is limited to just two servers and can only grow by upgrading its hardware. This makes Partitioned Cache a better fit for large-scale applications, as it offers both performance and scalability, whereas Mirrored Cache is suitable for applications that prioritize data availability and can function within the two-server limit.
Partitioned Cache is an ideal choice when an application needs to handle large amounts of data and a high number of transactions. It scales efficiently by distributing data across multiple servers, ensuring smooth performance even under heavy loads. In contrast, Replicated Cache is not suitable for scaling data updates since it replicates the same data across all servers, which can create overhead for write operations. However, it is very efficient for read-only operations because all servers store an identical copy of the data. Additionally, Replicated Cache is particularly useful when two copies of the entire cache need to be maintained, either in the same location or across different locations, for redundancy and high availability.
Partition-Replica Cache combines the performance advantages of Partitioned Cache with the redundancy and high availability of data replication. While Mirrored Cache also provides replication, Partition-Replica Cache scales further by distributing data across multiple servers, whereas Mirrored Cache operates with only two servers. Partition-Replica Cache is therefore more suitable for high-performance, large-scale systems that need data replication and availability without compromising scalability, while Mirrored Cache is better for smaller-scale applications where high availability is essential but scalability is not a major requirement.
Both Partition-Replica Cache and Replicated Cache ensure data redundancy, but they differ significantly in scalability. Partition-Replica Cache distributes data across multiple servers while maintaining replicas for high availability, allowing it to scale efficiently for both data and transaction load. In contrast, Replicated Cache replicates the entire dataset across all servers, which limits its scalability, especially for write-heavy applications. However, Replicated Cache is extremely fast for read-only operations.
The performance benchmarks discussed in this whitepaper demonstrate that Partitioned Cache is the fastest and most scalable topology, offering linear scalability for both transaction load and data. It is an ideal choice for high-volume, read-heavy applications that can handle potential cache failures by reloading data from the database when needed. On the other hand, a Partition-Replica Cache strikes a balance between high availability and scalability by ensuring redundancy through data replication. It is the most widely used topology because it delivers the performance of Partitioned Cache while providing the necessary reliability and fault tolerance for mission-critical applications.
Mirrored Cache offers redundancy through data replication, making it a suitable option for smaller-scale applications where high availability is crucial. However, its scalability is limited as it can only operate with two servers and relies on larger server profiles for scaling. Replicated Cache provides redundancy and performs well for read-heavy applications. However, it does not scale well for write-heavy operations, as it replicates the entire cache across all servers. Despite this, it remains beneficial when two copies of the entire cache are required for redundancy, either within the same location or across different locations.
When selecting a caching topology, it is important to weigh the considerations outlined above against the requirements and workload of your application.
As applications continue to grow in terms of traffic volume and complexity, caching solutions must evolve to handle greater loads, larger data sets, and higher performance demands. Future improvements in NCache can focus on optimizing replication mechanisms to further enhance performance in Replicated Cache and Partition-Replica Cache topologies, making them more efficient for write-heavy applications with minimal performance degradation.
Further scalability improvements could also be explored for Mirrored Cache, allowing it to scale more effectively while maintaining its data availability features. As the adoption of multi-cloud and hybrid cloud environments grows, future versions of NCache could benefit from improved cross-data center caching, providing better fault tolerance and enhanced performance for geographically distributed applications.
© Copyright Alachisoft 2002 - . All rights reserved. NCache is a registered trademark of Diyatech Corp.