Managing Data Relationships in Distributed Cache

Author: Iqbal Khan

Introduction

A distributed cache lets you greatly improve application performance and scalability. This is because an in-memory cache is a lot faster for data access than a database. Similarly, unlike with databases, you can increase the number of cache servers you are using to increase storage capacity and facilitate more requests per second, thus increasing scalability.

Unfortunately, many in-memory caches suffer from the fact that they utilize simple hash tables with key-value pairs to store data, which are not effective for relational data. Essentially, each item is stored in the cache independently without any knowledge of any other related items. This makes it hard for applications to track relationships between cached items. If one item is updated or removed in the database, any related items may also change, but the cache remains unaware and is not able to reflect those changes.

A typical real-life application deals with relational data that has one-to-one, many-to-one, one-to-many, and many-to-many relationships with other data elements in the database. This requires referential integrity to be maintained across different related data elements. Therefore, in order to preserve data integrity in the cache, the cache must understand these relationships and maintain the same referential integrity.

One way to maintain these relationships is the ASP.NET Cache Dependency that Microsoft introduced. In this case, when a cached item is updated or removed, all related items are automatically invalidated to maintain data integrity. When the application can't find the related items in the cache, it queries the database for the most recent versions, restores them to the cache, and re-establishes referential integrity.

However, ASP.NET Cache is limited to single-server environments, so for scalability you need a distributed cache that runs across multiple servers. Fortunately, NCache provides the same Cache Dependency feature in a distributed environment.

Although NCache provides various types of dependencies, including Data Dependency, File Dependency, SQL Dependency, and Custom Dependency, this article only discusses the Data Dependency for handling relationships among cached items.

Using Data Dependency in the Cache

Below is a brief example of how to use Data Dependency to specify a multi-level dependency.

public static void CreateDependencies(ICache _cache)
{
    try
    {
        string keyC = "objectC-1000";
        Object objC = new Object();
        string keyB = "objectB-1000";
        Object objB = new Object();
        string keyA = "objectA-1000";
        Object objA = new Object();
        // Initializing cacheItems
        var itemOne = new CacheItem(objA);
        var itemTwo = new CacheItem(objB);
        var itemThree = new CacheItem(objC);
        // Make objA dependent on objB
        itemOne.Dependency = new KeyDependency(keyB);
        // Make objB dependent on objC
        itemTwo.Dependency = new KeyDependency(keyC);
        // Add the items to the cache, starting with the one the others depend on
        _cache.Add(keyC, itemThree);
        _cache.Add(keyB, itemTwo);
        _cache.Add(keyA, itemOne);

        // Removing "objC" automatically removes "objB" as well as "objA"
        _cache.Remove(keyC);
        _cache.Dispose();
    }
    catch (Exception e)
    {
        throw;
    }
}
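
To make the cascading behavior concrete, here is a minimal in-memory sketch of key-based dependencies. It is a toy model for illustration only, not how NCache is implemented; the class name DependencyCacheSketch and its Insert, Contains, and Remove methods are hypothetical names chosen for this sketch.

```csharp
using System;
using System.Collections.Generic;

// Toy model of key-based dependencies (hypothetical; not the NCache implementation).
public class DependencyCacheSketch
{
    private readonly Dictionary<string, object> _items = new Dictionary<string, object>();
    // For each key, the keys it depends on.
    private readonly Dictionary<string, string[]> _parents = new Dictionary<string, string[]>();
    // For each key, the keys that depend on it.
    private readonly Dictionary<string, HashSet<string>> _dependents = new Dictionary<string, HashSet<string>>();

    public bool Contains(string key)
    {
        return _items.ContainsKey(key);
    }

    public void Insert(string key, object value, params string[] dependsOn)
    {
        // Replacing an existing item counts as an update, so its dependents are removed first.
        Remove(key);

        foreach (var parent in dependsOn)
        {
            if (!_items.ContainsKey(parent))
                throw new InvalidOperationException("Missing dependency key: " + parent);
        }

        _items[key] = value;
        _parents[key] = dependsOn;
        foreach (var parent in dependsOn)
        {
            if (!_dependents.TryGetValue(parent, out var set))
            {
                set = new HashSet<string>();
                _dependents[parent] = set;
            }
            set.Add(key);
        }
    }

    public void Remove(string key)
    {
        if (!_items.Remove(key)) return;

        // Detach this key from the items it depended on.
        if (_parents.TryGetValue(key, out var parents))
        {
            _parents.Remove(key);
            foreach (var parent in parents)
            {
                if (_dependents.TryGetValue(parent, out var siblings))
                    siblings.Remove(key);
            }
        }

        // Cascade: everything that depended on this key is removed as well.
        if (_dependents.TryGetValue(key, out var children))
        {
            _dependents.Remove(key);
            foreach (var child in new List<string>(children))
                Remove(child);
        }
    }
}
```

With this model, inserting "objectC-1000", then "objectB-1000" depending on it, then "objectA-1000" depending on "objectB-1000", and finally removing "objectC-1000" leaves none of the three keys in the cache, which mirrors the multi-level example above.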

Data Relationships

The following example is used in this article to demonstrate how various types of relationships are handled in the cache.

Figure 2: Relationships in the Database

In the diagram shown above, the following relationships are shown:

  • One to Many: There are two such relationships:
    1. Customer to Order
    2. Product to Order
  • Many to One: There are two such relationships:
    1. Order to Customer
    2. Order to Product
  • Many to Many: There is one such relationship: Customer to Product (via Order)

For the above relationships, the following domain objects are designed.

class Customer
{
    public string CustomerID;
    public string CompanyName;
    public string ContactName;
    public string ContactTitle;
    public string Phone;
    public string Country;
    public IList<Order> _OrderList;
}

class Product
{
    public int ProductID;
    public string ProductName;
    public Decimal UnitPrice;
    public int UnitsInStock;
    public int UnitsOnOrder;
    public int ReorderLevel;
    public IList<Order> _OrderList;
}

class Order
{
    public int OrderId;
    public string CustomerID;
    public DateTime OrderDate;
    public DateTime RequiredDate;
    public DateTime ShippedDate;
    public int ProductID;
    public Decimal UnitPrice;
    public int Quantity;
    public Single Discount;
    public Customer _Customer;
    public Product _Product;
}

As you can see, the Customer and Product classes each contain an _OrderList, which holds all Order objects related to that customer or product. Similarly, the Order class contains the _Customer and _Product data members pointing to the related Customer and Product objects. The source code responsible for loading data must therefore ensure that whenever a Customer is fetched, all related Orders are loaded as well.

The following example demonstrates how each of these relationships is managed in the cache.

Handling One-to-One/Many-to-One Relationships

Whenever you have fetched an object from the cache that also has a one-to-one or many-to-one relationship with another object, your source code might have also loaded the related object. However, it is not always required to load the related object because the application may not need it at that time. If your source code has loaded the related object, then you need to handle it.

There are two ways you can handle this. I will call one the optimistic way and the other the pessimistic way, and explain each of them below:

  1. Optimistic handling of relationships: In this approach, we assume that even though there are relationships, nobody else is going to modify the related object separately. Whoever wants to modify the related objects will fetch them through the primary object in the cache and will therefore be in a position to modify both primary and related objects. In this case, we do not have to store both of these objects separately in the cache. Instead, the primary object contains the related object, both are stored as one cached item, and no Data Dependency is created between them.
  2. Pessimistic handling of relationships: In this case, you assume that the related object can be independently fetched and updated by another user, and therefore, the related object must be stored as a separate cached item. Then, if anybody updates or removes the related object, you want your primary object to also be removed from the cache. In this case, you'll create a Data Dependency between the two objects.

Below is the source code for handling the optimistic approach. Please note that both the primary object and its related objects are cached as one item because the serialization of the primary object would also include the related objects.

static void Main(string[] args)
{
    string cacheName = "myReplicatedCache";
    ICache _cache = CacheManager.GetCache(cacheName);
    OrderFactory oFactory = new OrderFactory();
    Order order = new Order();
    order.OrderId = 1000;
    oFactory.LoadFromDb(order);
    Customer cust = order._Customer;
    Product prod = order._Product;
    var itemOne = new CacheItem(order);
    // please note that Order object serialization will
    // also include Customer and Product objects
    _cache.Add(order.OrderId.ToString(), itemOne);
    _cache.Dispose();
}

Below is the source code for handling the pessimistic approach. Unlike the optimistic scenario, which does not require Data Dependency at all, this approach relies on it.

static void Main(string[] args)
{
    string cacheName = "myReplicatedCache";
    ICache _cache = CacheManager.GetCache(cacheName);
    OrderFactory oFactory = new OrderFactory();
    Order order = new Order();
    order.OrderId = 1000;
    oFactory.LoadFromDb(order);
    Customer cust = order._Customer;
    Product prod = order._Product;
    string custKey = "Customer:CustomerID:" + cust.CustomerID;
    _cache.Insert(custKey, cust);
    string prodKey = "Product:ProductID:" + prod.ProductID;
    _cache.Insert(prodKey, prod);
    string[] depKeys = { prodKey, custKey };
    string orderKey = "Order:OrderID:" + order.OrderId;
    // We are setting _Customer and _Product to null so they
    // don't get serialized with Order object
    order._Customer = null;
    order._Product = null;
    var item = new CacheItem(order);
    item.Dependency = new CacheDependency(null, depKeys);
    _cache.Add(orderKey, item);
    _cache.Dispose();
}

The code above loads an Order object from the database, while both the Customer and Product objects load automatically because the Order object has a many-to-one relationship with them. The application adds the Customer and Product objects to the cache first, followed by the Order object, which is set to depend on both. This way, if either the Customer or the Product is updated or removed in the cache, the Order object is automatically removed from the cache to preserve data integrity. The application does not have to keep track of this relationship.
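
Once the Order has been invalidated, the next read misses the cache and the application has to reload it from the database and cache it again. The following is a minimal sketch of that read path. It assumes a plain dictionary standing in for the cache and a hypothetical loadFromDb delegate standing in for a call such as OrderFactory.LoadFromDb; neither the class name CacheAsideSketch nor the GetOrLoad method is part of NCache.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper illustrating the "reload on miss" read path.
public static class CacheAsideSketch
{
    // Returns the cached value for the key. On a miss (for example, after a
    // dependency removed the item), it calls loadFromDb, caches the result,
    // and returns it.
    public static T GetOrLoad<T>(IDictionary<string, object> cache, string key, Func<T> loadFromDb)
    {
        if (cache.TryGetValue(key, out object cached))
        {
            return (T)cached;
        }

        T fresh = loadFromDb();
        cache[key] = fresh;
        return fresh;
    }
}
```

In a real application, the reload step would also re-create the Data Dependency on the Customer and Product keys, as shown in the pessimistic example above, so that the relationship is re-established along with the item.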

Handling One-to-Many Relationships

Whenever you have fetched an object from the cache that also has a one-to-many relationship with another object, your source code may load both the primary object and a collection of all the related objects. However, it is not always necessary to load the related objects, as the application may not need them at that time. If it does load the related objects, the fact that they are in a collection introduces further issues. There are three ways to handle them, as discussed below:

  1. Optimistic Handling of Relationships: In this case, we assume that even though there are relationships, nobody else is going to modify the related objects separately. Whoever wants to modify the related objects will fetch them through the primary object in the cache and will, therefore, be in a position to modify both the primary and related objects. Thus, we do not have to store these objects separately in the cache; the collection is cached as part of the primary object.
  2. Mildly Pessimistic Handling of Relationships: In this approach, related objects are not fetched individually but only as part of a complete collection. As a result, the entire collection is stored as a single cached item, with a dependency on the primary object. This way, if the primary object is updated or removed, the related collection is also invalidated to maintain consistency.
  3. Really Pessimistic Handling of Relationships: In this case, you assume that all objects in the related collection can also be fetched individually by the application and modified. Therefore, you must not only store the collection but also all its individual objects in the cache separately. Please note that this would likely cause performance issues because you're making multiple trips to the cache, which may be residing across the network on a cache server.

Below is an example of how you can handle one-to-many relationships optimistically.

static void Main(string[] args)
{
    string cacheName = "ltq";
    ICache _cache = CacheManager.GetCache(cacheName);
    CustomerFactory cFactory = new CustomerFactory();
    Customer cust = new Customer();
    cust.CustomerID = "ALFKI";
    cFactory.LoadFromDb(cust);
    // please note that _OrderList will automatically get
    // serialized along with the Customer object
    string custKey = "Customer:CustomerID:" + cust.CustomerID;
    _cache.Add(custKey, cust);
    _cache.Dispose();
}

Below is an example of how to handle a one-to-many relationship mildly pessimistically.

static void Main(string[] args)
{
    string cacheName = "myReplicatedCache";
    ICache _cache = CacheManager.GetCache(cacheName);
    CustomerFactory cFactory = new CustomerFactory();
    Customer cust = new Customer();
    cust.CustomerID = "ALFKI";
    cFactory.LoadFromDb(cust);
    IList<Order> orderList = cust._OrderList;
    // please note that _OrderList will not be 
    // serialized along with the Customer object
    cust._OrderList = null;
    string custKey = "Customer:CustomerID:" + cust.CustomerID;
    var custItem = new CacheItem(cust);
    _cache.Add(custKey, custItem);
    // let's reset the _OrderList back
    cust._OrderList = orderList;
    string[] depKeys = { custKey };
    string orderListKey = "Customer:OrderList:CustomerID:" + cust.CustomerID;
    // Cache the whole order list as a single item that depends on the
    // Customer item, so updating or removing the Customer invalidates it
    var orderListItem = new CacheItem(orderList);
    orderListItem.Dependency = new CacheDependency(null, depKeys);
    _cache.Add(orderListKey, orderListItem);
    _cache.Dispose();
}

In the above example, the list of Order objects associated with the Customer is stored as a separate cached item. The entire collection is cached as one item because we are assuming that nobody will directly modify individual Order objects separately. The application will always fetch the collection through the Customer, modify it, and re-cache the entire collection.

The remaining case is the really pessimistic handling of one-to-many relationships, which is similar to how we handle collections in the cache. That topic is discussed in the next section.
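
The examples in this article build cache keys by hand using a Type:Attribute:Value pattern, such as "Customer:CustomerID:ALFKI". Because a dependency only works if the key it names exactly matches the key of the cached item, it may help to build keys in one place. The following is a small sketch of such a helper; the class name CacheKeySketch and its methods are hypothetical and not part of NCache.

```csharp
using System;
using System.Globalization;

// Hypothetical helper that builds keys in the Type:Attribute:Value
// pattern used by the examples in this article.
public static class CacheKeySketch
{
    public static string Build(string typeName, string attribute, object value)
    {
        // Invariant culture keeps numeric IDs formatted the same on every server.
        return typeName + ":" + attribute + ":" + Convert.ToString(value, CultureInfo.InvariantCulture);
    }

    public static string ForCustomer(string customerId)
    {
        return Build("Customer", "CustomerID", customerId);
    }

    public static string ForProduct(int productId)
    {
        return Build("Product", "ProductID", productId);
    }

    public static string ForOrder(int orderId)
    {
        return Build("Order", "OrderID", orderId);
    }
}
```

For example, CacheKeySketch.ForCustomer("ALFKI") returns "Customer:CustomerID:ALFKI", the same key the examples above construct by string concatenation.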

Handling Collections in the Cache

There are many situations where you fetch a collection of objects from the database. This could be due to a query you ran, or it could be a one-to-many relationship returning a collection of related objects on the "many" side. In either case, you get a collection of objects that must be handled in the cache appropriately.

There are two ways to handle collections, as explained below:

  1. Optimistic Handling of Collections: In this approach, we assume that the entire collection should be cached as one item because nobody will individually fetch and modify the objects kept inside the collection.
  2. Pessimistic Handling of Collections: In this case, we assume that individual objects inside the collection can be fetched separately and modified. Therefore, we cache the entire collection, but then also cache each individual object and create a dependency from the collection to the individual objects.

Below is an example of how to handle collections optimistically.

static void Main(string[] args)
{
    string cacheName = "myReplicatedCache";
    ICache _cache = CacheManager.GetCache(cacheName);
    CustomerFactory cFactory = new CustomerFactory();
    Customer cust = new Customer();
    string custListKey = "CustomerList:LoadByCountry:Country:United States";
    IList<Customer> custList = cFactory.LoadByCountry("United States");
    IDistributedList<Customer> list = _cache.DataTypeManager.CreateList<Customer>(custListKey);

    // please note that all Customer objects kept in custList
    // will be serialized along with the custList
    foreach (var customer in custList)
    {
        // Add each customer to the distributed list
        list.Add(customer);
    }
    _cache.Dispose();
}

In the example discussed above, the entire collection is cached as one item, and all the Customer objects kept inside the collection are automatically serialized. Therefore, there is no need to create any Data Dependency here.

Below is an example of how to handle collections pessimistically.

static void Main(string[] args)
{
    string cacheName = "myReplicatedCache";
    ICache _cache = CacheManager.GetCache(cacheName);
    CustomerFactory cFactory = new CustomerFactory();
    Customer cust = new Customer();
    IList<Customer> custList = cFactory.LoadByCountry("United States");
    List<string> custKeys = new List<string>();
    // Let's cache individual Customer objects and also build
    // a list of keys to be used later in CacheDependency
    foreach (Customer c in custList)
    {
        string custKey = "Customer:CustomerID:" + c.CustomerID;
        custKeys.Add(custKey);
        _cache.Insert(custKey, c);
    }
    string custListKey = "CustomerList:LoadByCountry:Country:United States";
    // please note that this collection has a dependency on all
    // objects in it separately. So, if any of them are updated or
    // removed, this collection will also be removed from cache
    var listItem = new CacheItem(custList);
    listItem.Dependency = new CacheDependency(null, custKeys.ToArray());
    _cache.Add(custListKey, listItem);

    _cache.Dispose();
}

In the example shown above, each object in the collection is cached as a separate item, and then the entire collection is also cached as one item. The collection has a Data Dependency on all its objects that are cached separately. This way, if any of these objects are updated or removed, the collection is also removed from the cache.


Author: Iqbal Khan works for Alachisoft, a leading software company providing .NET distributed caching solutions. You can reach him at iqbal@alachisoft.com.

© Copyright Alachisoft 2002 - . All rights reserved. NCache is a registered trademark of Diyatech Corp.