Cache Startup Loader and Refresher Properties and Overview
This feature is only available in NCache Enterprise Edition.
NCache provides a cache startup loader to pre-load the cache with useful data on startup. This is especially helpful in scenarios where an application requires specific data sets immediately after it begins execution.
Consider a video streaming site with hundreds of videos that must be available the moment a user accesses the site. For such an application, the cache can be pre-loaded with the existing videos on cache startup instead of adding the data to it manually.
Note: At times, the Cache Loader might fail to load data successfully. This can result from a connectivity issue with the primary data source or from an error in the custom cache loader implementation. To identify the cause, check the server-side cache logs for Cache Loader errors or exceptions.
Loading your cache with data on startup avoids the performance issues that arise right after cache startup, when the cache is still empty and initial data requests go to the database (which is slow). The NCache Cache Startup Loader feature lets you pre-load your cache with data of your choice at startup.
As useful as pre-loading is, the loaded data can easily become stale. The data is loaded on cache startup, so any later change in the data source leaves the pre-loaded cache data outdated. To keep the loaded data fresh, NCache provides another feature called Cache Refresher. The Cache Refresher is responsible for synchronizing the loaded data in the cache with the updated data in the data source.
Cache Loader and Refresher Properties
The NCache Cache Loader and Refresher help boost the overall performance of your applications, especially at the time of cache startup. Their properties are explained below:
In earlier NCache versions, the cache and the cache loader ran in the same process, which overburdened the cache process, especially while the loader was executing. This temporarily degraded the performance of the overall cache.
Now, for OutProc topologies, NCache provides a dedicated Loader Service to manage tasks and load data from the data source into caches on cache startup. For a clustered topology, each node has a dedicated service responsible for loading data into its cache. For InProc topology, the task is executed in the same process.
For clustered topologies, if loading the data on a single node takes a considerable amount of time, NCache provides the option to distribute the data load among the nodes of the cluster. The data is distributed based on datasets provided by the user. Each node has a loader service that loads the data for the datasets assigned to it.
NCache internally assigns the datasets to the nodes. This ensures that no two nodes end up loading the same data into the cache. It also allows large volumes of data to be loaded in less time.
Datasets are distributed among the cluster nodes in a round-robin fashion. Therefore, each of the servers is assigned a dataset from the list. As one of the nodes finishes loading data against its dataset, the next dataset is assigned to it. In case the number of distribution datasets is greater than the number of nodes, NCache will assign one dataset to each node, and when all nodes have been assigned datasets, the next dataset would be assigned to the available node which has finished loading data.
Suppose the user wants to load specific data from the Northwind database into a clustered cache of 3 nodes on startup. The Cache Loader performance is then affected by the number of datasets being assigned. This behavior is explained below:
5 datasets to load: The user allocates 5 datasets (Customers, Orders, Products, Employees, and Suppliers) to the loader. The coordinator node then assigns the datasets to the nodes in a round-robin manner: Customers to node1, Orders to node2, and Products to node3. As soon as a node finishes loading data, the coordinator assigns the next dataset, i.e., Employees and eventually Suppliers, to the next available node.
3 datasets to load: The user assigns 3 datasets (Customers, Products and Orders) to the loader. This means that each node is responsible for the dataset assigned to it, so it will load the data according to the dataset while ensuring equal distribution.
2 datasets to load: The user assigns 2 datasets (Customers and Products) to the loader. Since the cluster consists of three nodes, the third node will be idle during the loading process. That is why it is preferred that the number of datasets is equal to or greater than the number of nodes so maximum utilization is ensured.
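The three scenarios above can be reproduced with a small simulation. This is only an illustrative sketch of the assignment rule described here (one dataset per node first, then the next dataset to whichever node finishes first); it is not NCache code. The function name `assign_datasets` and the load durations are hypothetical.

```python
import heapq

def assign_datasets(datasets, node_count, load_time):
    """Simulate dataset assignment across cluster nodes.

    datasets:   ordered list of dataset names supplied by the user.
    node_count: number of nodes in the cluster.
    load_time:  dict of dataset name -> assumed load duration (made-up numbers).
    Returns a dict of node number (1-based) -> datasets loaded by that node.
    """
    assignments = {node: [] for node in range(1, node_count + 1)}
    # Min-heap of (time at which the node becomes free, node number).
    # All nodes start free at time 0, so the first datasets go to node1, node2, ...
    free_at = [(0, node) for node in range(1, node_count + 1)]
    heapq.heapify(free_at)
    for name in datasets:
        available_time, node = heapq.heappop(free_at)  # next available node
        assignments[node].append(name)
        heapq.heappush(free_at, (available_time + load_time[name], node))
    return assignments

# Hypothetical load durations, used only to decide which node frees up first.
durations = {"Customers": 30, "Orders": 10, "Products": 20,
             "Employees": 5, "Suppliers": 5}

five = assign_datasets(
    ["Customers", "Orders", "Products", "Employees", "Suppliers"], 3, durations)
two = assign_datasets(["Customers", "Products"], 3, durations)
print(five)
print(two)
```

With these assumed durations, node2 finishes Orders first and therefore picks up Employees and then Suppliers, while in the two-dataset case node3 receives nothing and stays idle.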
Datasets need to be scheduled for refreshing. For this purpose, a scheduling option decides when the data in the cache should be updated. At each refresh interval, the refresher checks whether any datasets are due for an update and refreshes the corresponding data in the cache. The four schedule options provided with Cache Refresher work as follows:
Daily Interval: The dataset is refreshed repeatedly at a fixed interval, specified in minutes, counted from cache startup. For example, an interval of 20 minutes means that the dataset is scheduled to be refreshed every 20 minutes.
Daily Time: The Dataset is scheduled to be refreshed every day at a specific time provided by the user.
Weekly: The dataset is refreshed on specific days every week at the time specified by the user. For example, if you want your loaded datasets to be refreshed every Monday, Thursday, and Saturday at exactly 12:00 am, you need to set weekly dataset scheduling.
Monthly: The dataset is refreshed on one or more days of the week in one or more weeks of every month. For example, you can schedule the dataset so that the service refreshes it every Monday of the first and last week of the month.
The schedule expression has the format week:days:hours:minutes.
- Week can be 0-3, 0 being the first week of the month.
- Days can be 1-7, representing the days of the week. You can specify more than one day of the week by separating the days with commas.
- Hours and minutes specify the time of day for the scheduled refresh.
- Multiple weeks can be selected from a month for scheduling.
Let us take a few examples to understand how the scheduling expression works:
The schedule expression 0,1:2:12:00 shows that the datasets are scheduled to be refreshed on the second day of the first and second week of the month at 12:00 pm.
The schedule expression 0:1,2,7:15:30 shows that the datasets are scheduled to be refreshed on the first, second, and seventh day of the first week of the month, i.e., Monday, Tuesday, and Sunday, at 3:30 pm.
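To make the week:days:hours:minutes format concrete, here is a minimal parsing sketch. It is an assumption-laden illustration, not NCache's own parser: the names `MonthlySchedule` and `parse_schedule` are hypothetical, the day numbering (1 = Monday, 7 = Sunday) is taken from the example above, and the 0-23 hour range is assumed from the 15:30 example.

```python
from dataclasses import dataclass

# Day numbering inferred from the example in the text: 1 = Monday ... 7 = Sunday.
DAY_NAMES = {1: "Monday", 2: "Tuesday", 3: "Wednesday", 4: "Thursday",
             5: "Friday", 6: "Saturday", 7: "Sunday"}

@dataclass(frozen=True)
class MonthlySchedule:
    weeks: tuple   # 0-3, where 0 is the first week of the month
    days: tuple    # 1-7, days of the week
    hour: int      # assumed 24-hour clock
    minute: int

def parse_schedule(expr):
    """Parse a 'week:days:hours:minutes' expression into a MonthlySchedule."""
    parts = expr.split(":")
    if len(parts) != 4:
        raise ValueError("expected week:days:hours:minutes, got %r" % expr)
    weeks = tuple(int(w) for w in parts[0].split(","))
    days = tuple(int(d) for d in parts[1].split(","))
    hour, minute = int(parts[2]), int(parts[3])
    if any(not 0 <= w <= 3 for w in weeks):
        raise ValueError("week must be between 0 and 3")
    if any(not 1 <= d <= 7 for d in days):
        raise ValueError("day must be between 1 and 7")
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("time of day out of range")
    return MonthlySchedule(weeks, days, hour, minute)

first = parse_schedule("0,1:2:12:00")
second = parse_schedule("0:1,2,7:15:30")
print(first)
print([DAY_NAMES[d] for d in second.days])
```

Parsing into a small immutable record keeps the validation in one place, so a malformed expression such as a week value of 4 is rejected before any scheduling logic runs.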
The user provides the implementation that determines which objects are loaded from the master data source. Each individual datum is encapsulated in a CacheItem, which is added to the cache on cache startup.
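The idea of wrapping each datum in a cache item can be sketched as follows. This is a conceptual illustration only and does not use the NCache API: the `CacheItem` class here is a hypothetical stand-in, the cache is a plain dictionary, and the key scheme and sample rows are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class CacheItem:
    """Hypothetical stand-in for a cache item; not NCache's CacheItem class."""
    key: str
    value: dict

def wrap_rows(dataset, rows):
    """Wrap each source row in a CacheItem keyed as '<dataset>:<id>' (assumed scheme)."""
    return [CacheItem(key="%s:%s" % (dataset, row["id"]), value=row) for row in rows]

def populate(cache, items):
    """Add every item to the cache (a dict here) and return how many were added."""
    for item in items:
        cache[item.key] = item
    return len(items)

# Invented sample rows standing in for records read from the data source.
rows = [{"id": 1, "name": "First customer"}, {"id": 2, "name": "Second customer"}]
cache = {}
loaded = populate(cache, wrap_rows("Customers", rows))
print(loaded, sorted(cache))
```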
Cache Loader Retries
If an operation fails while loading the cache, it can be retried before the loader proceeds to the next one. By default, NCache does not retry the failed operation. However, retries can be enabled, and the number of retries can be configured through the NCache Web Manager or the NCache PowerShell cmdlets.
Cache Loader Retry Interval
If the user opts to enable retries for failed operations, the user can also specify the time interval in seconds to wait before trying the failed operation again. The interval is 0 by default and can be configured through the NCache Web Manager or the NCache PowerShell cmdlets.
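The interplay between the retry count and the retry interval can be sketched as below. This is a generic retry loop written under the defaults stated above (no retries, interval of 0 seconds); it is not how NCache implements the feature, and `load_with_retries` and its parameters are hypothetical names.

```python
import time

def load_with_retries(operation, retries=0, retry_interval=0, sleep=time.sleep):
    """Run one load operation, retrying it on failure.

    retries:        extra attempts after the first failure (0 means no retry).
    retry_interval: seconds to wait before each retry.
    sleep:          injectable wait function, so callers can avoid real delays.
    Returns True if the operation eventually succeeded, False otherwise.
    """
    failures = 0
    while True:
        try:
            operation()
            return True
        except Exception:
            if failures >= retries:
                return False  # out of retries; caller moves on to the next operation
            failures += 1
            sleep(retry_interval)

# Usage: an operation that fails twice before succeeding.
attempts = {"count": 0}

def flaky_load():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise IOError("data source unavailable")

waits = []
succeeded = load_with_retries(flaky_load, retries=2, retry_interval=5, sleep=waits.append)
print(succeeded, attempts["count"], waits)
```

Passing the wait function as a parameter is a design choice for this sketch only; it lets the example record the waits instead of actually pausing.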
In order to check which datasets need to be updated/refreshed, a thread runs after a specific time period known as the refresh interval. By default, the refresh interval is set to 15 minutes. The minimum value for this interval is 1 minute, and the maximum value can be set up to 60 minutes. The refresh interval can be configured through the NCache Web Manager or the NCache PowerShell cmdlets.
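As a rough illustration of the periodic check, the sketch below validates a refresh interval against the limits stated above (default 15, minimum 1, maximum 60 minutes) and picks the datasets whose scheduled refresh time has arrived. The function names and the sample schedule are hypothetical; the real check runs inside NCache.

```python
from datetime import datetime

DEFAULT_REFRESH_INTERVAL = 15  # minutes
MIN_REFRESH_INTERVAL = 1
MAX_REFRESH_INTERVAL = 60

def check_refresh_interval(minutes=DEFAULT_REFRESH_INTERVAL):
    """Return the interval if it lies within the allowed 1-60 minute range."""
    if not MIN_REFRESH_INTERVAL <= minutes <= MAX_REFRESH_INTERVAL:
        raise ValueError("refresh interval must be between 1 and 60 minutes")
    return minutes

def datasets_due(now, next_refresh):
    """Return, in sorted order, the datasets whose next refresh time is not after `now`.

    next_refresh: dict of dataset name -> datetime of its next scheduled refresh.
    """
    return sorted(name for name, due in next_refresh.items() if due <= now)

# Invented schedule: one dataset overdue, one due exactly now, one in the future.
now = datetime(2024, 1, 1, 12, 0)
schedule = {
    "Customers": datetime(2024, 1, 1, 11, 50),
    "Orders": datetime(2024, 1, 1, 12, 30),
    "Products": datetime(2024, 1, 1, 12, 0),
}
print(check_refresh_interval(), datasets_due(now, schedule))
```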
On Demand Dataset Refresh
The user also has the option to refresh datasets manually through the Invoke-RefresherDataset cmdlet. Through this cmdlet, the user can refresh the datasets either immediately or within the next 24 hours, using the RefreshPreference option of the cmdlet.