SQL Query: Behavior and Usage Overview
This feature is only available in NCache Enterprise Edition.
NCache is a key-value cache where every new object is stored against a unique key. Similar to a relational database where the primary key finds data, NCache allows you to retrieve that data by specifying the corresponding key. However, many practical scenarios require the capability to search flexibly to acquire data based on different attributes, not just the primary key. Since most high transactional applications keep data in the cache, the ability to query data is highly desirable.
Importance of Querying in Cache
Assume that there is an e-commerce website, where the products are queried based on a range of criteria such as category, price, weight, etc. To improve the application performance, using a distributed caching system seems to be a befitting solution. However, losing the query capability or rewriting the application is not acceptable.
In such a scenario, NCache is an ideal distributed cache that can scale the application performance without compromising the query capability. Specifically, NCache enables you to query cached data based on different attributes using SQL-like query syntax. By employing NCache for the previously mentioned scenario, you can query for all the products by using category, price, or weight as your criteria.
NCache allows searching and deleting data in a cache using SQL-like query syntax.
NCache provides the flexibility to retrieve projections, the whole object, the cache keys only, or the key-value pair itself based on your requirement. Moreover, NCache also supports the basic query operators, logical operators, and aggregate functions provided by SQL.
How it Works
In NCache, the data is distributed across different cluster nodes due to partitioning. Searching for a customer using its ID might be easy enough when there are a thousand customers. However, finding one in a million would not have the same complexity, when the records are distributed across different nodes, and the query criteria can be any customer attribute. In this context, SQL query in the cache requires an efficient data-searching mechanism.
In particular, NCache employs indexing and distributed queries to support real-time SQL-like searching capability in the cache. Indexing makes an attribute searchable in the cache. Once an item is added against an index, it returns as a query result on the node to which the client is connected. Meanwhile, a query is executed in a distributed fashion when data is distributed in the Partitioned Topology.
An attribute must be indexed to be searchable in the cache.
Traditional databases use indexing to boost query performance. However, NCache uses indexing as mandatory to enable searching through SQL-like queries. To query cache data, all searchable attributes need to be indexed first. It is important to note that the fields specified in projections and criteria must be indexed. Once an index is defined for an attribute, any relevant data added to the cache is automatically indexed and becomes searchable. Hence, defining indexes once is sufficient.
Read operations are much faster due to indexing, but there is an overhead for each write operation.
NCache uses indexing since it is a performance-oriented approach. Indexing eliminates the need to query every entry in the cache while searching for a query result set. This expedites the searching mechanism and saves time and computing power.
Do not index unnecessary fields, since it would result in memory and performance overhead.
The query executed in NCache varies based on the cache topology and the underlying client connectivity. In the case of the Mirrored or Replicated Topology, the query is executed only on the node with which the client connects. However, this does not occur when data partitions across the cluster nodes.
The queries over a partitioned cluster execute in a distributed manner primarily due to data partitioning. The query executes individually on each node, since the client connects to all the cluster nodes. The result set consisting of the required data is returned from each node and merged before returning the final result to the client. Here, the performance also relies on the number of cluster nodes and the size of the result set returned by each node.
The Distributed Query in NCache is highly scalable.
The Distributed Query yields higher scalability in NCache due to the partitioned data and its parallel execution. When new nodes join the cluster, the partitioned data for each member reduces. Therefore, the query execution time spent by each server node reduces. Moreover, the network traffic also reduces - due to the concurrent execution on each node.
SQL Query in NCache
NCache allows you to search and delete cache data using SQL-like SELECT and DELETE queries, respectively. INSERT and UPDATE operations are currently not supported. ExecuteReader is used for executing SELECT query in the cache and returns the result set to an instance of ICacheReader. ExecuteScalar executes the query, and returns the first column of the first row in the result set returned by the query, any additional columns or rows are ignored.
ExecuteNonQuery is used for deleting items from the cache. You can query all primitive data types and nullable types provided that query indexes have been defined.
Make sure you close the instance of
ICacheReader after execution, as it is necessary to clean the resources.
The querying language lets you specify non-primary key criteria depending on which data is required. You can execute queries to query data in the cache using different projections and criteria. However, the attributes specified in projections and criteria must be indexed. Moreover, based on your requirement, you have the flexibility to retrieve the cache keys only or whole items fulfilling the given query criteria.
If you want to query for an attribute in the cache, it should first be indexed. NCache provides several flexible ways for indexing. Considering the need for predefined (static) indexes, NCache lets you manually define indexes for searchable attributes in advance. One way to define static indexes is through configuration. Alternatively, you can also define indexes programmatically.
On the other hand, dynamic indexes such as Groups, Tags, and Named Tags are automatically created at runtime for an attribute. Runtime indexes allow you to query data in the cache based on logical Groups, Tags, and Named Tags.
Query Using SQL Projections
NCache allows you to retrieve all the indexed attributes of a class or specific projections from the cache store based on your query. NCache supports different projections to facilitate efficient searching. Projections play an important role in the performance enhancement of your application, when it comes to queries. For instance, you want to retrieve only the available units column for products in a specific category. In that case, you can use the following SELECT query.
SELECT UnitsInStock FROM FQN.Product WHERE Category = "Sample_Category"
For performance enhancement, it is best to only query the essential columns, since returning the complete object is costly.
It is worth noticing, by retrieving only the desired columns, the additional overhead of retrieving all the indexed attributes of the
Product class is avoided. In this way, NCache facilitates the users to achieve enhanced performance through improved query syntax.
Query for Keys and Objects
Based on your requirement, you can retrieve the cache keys only or the associated data with your query. If you need to enumerate the keys only, NCache allows you to retrieve keys only. It is an efficient approach since the size of data to be fetched from the cache significantly reduces this way. Hence, you can use it for enhanced performance.
ExecuteReader has an option
getData, which is set as false for only key retrieval. By default, its value is true. If you need data for further processing, the default value can be used to query for objects.
Supported Query Operators
You can use a rich set of operators provided by NCache according to your required query criteria to fetch desired result sets from your cache.
The following operator categories are supported in NCache queries:
|Basic Query Operators||= , == , != , <> , < , > , <=, >=, IN, LIKE , NOT LIKE|
|Logical Operators||AND, OR , NOT|
|Aggregate Functions||SUM, COUNT, AVG, MIN, MAX|
|Miscellaneous||GROUP BY, ORDER BY, DateTime.Now, DateTime ("any date time compatible string")|
For further details, please refer to the Query Operators section.
NCache assumes the Client Cache (L1 cache) might have partial data. Considering that complete data is not guaranteed in the L1 cache, the queries always occur on the server cache (L2) in NCache. If the L1 cache is enabled on the client end executing the query, the query execution behavior will not be affected. The query will execute directly on the server cache. However, when data changes in clustered cache due to a DELETE query, L1 synchronizes with the L2 cache.
Although client cache does not influence query execution in NCache, it can play an essential role in performance enhancement. For instance, you want to perform further operations on the query result set, but fetching the data from the cache might hurt your application performance. If the client cache is configured with an L2 cache, you can retrieve the desired data with a single
GetBulk call. This saves time and avoids performance degradation, since operations performed on the client cache are much faster as compared to the L2 cache.