NCache Clustering Topologies
NCache provides a rich set of
clustering topologies so you can
pick the one that best suits
your requirements. Please note
that NCache clustering is not
the same as Windows Clustering.
NCache forms its own cache-level
cluster over TCP and can do so
on any Windows XP, 2000, 2003,
or Vista platform. The cluster
is formed among server nodes.
These nodes hold the cached
data, but how they hold it
depends on the clustering
topology you select. The
clustering topologies are
described below.
Local or Remote Clients
You can choose to run your
application on cache server
nodes or on remote client nodes.
A server node is one that hosts
the cache and participates in
the cache cluster. A remote
client node is any other node
connected to one of the cache
server nodes. For remote
clients, you simply install
NCache on the remote node and
use the same NCache API to
access the cache remotely. Your
application is unaware that
cache access is remote.
Here is how you would see this.
You can access all caching
topologies remotely in a
transparent manner. Each client
node is connected to one server
node. You can determine which
node to connect to either
through a client configuration
file or by pointing to a load
balancer that routes the
connection to an appropriate
server node. Additionally, if
the server you're connected to
goes down, the client node
automatically connects to the
next node in the list of server
nodes. This list is initially
specified in the configuration
file and is then updated at
run time as new nodes join the
cluster.
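
The failover behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea and not the NCache client or its API: the `ClientConnection` class, the `connect` callable, and the server names are all hypothetical, and a fake connect function stands in for a real network connection.

```python
class ServerDown(Exception):
    """Raised when a cache server cannot be reached."""


class ClientConnection:
    """Keeps one client connected to one server from a known server list."""

    def __init__(self, servers, connect):
        self.servers = list(servers)  # initial list, e.g. read from a config file
        self.connect = connect        # hypothetical callable: returns or raises ServerDown
        self.current = None

    def update_servers(self, servers):
        """Replace the list at run time, e.g. after new nodes join the cluster."""
        self.servers = list(servers)

    def connect_next(self):
        """Connect to the next reachable server after the current one."""
        if not self.servers:
            raise ServerDown("server list is empty")
        if self.current in self.servers:
            start = self.servers.index(self.current) + 1
        else:
            start = 0
        for offset in range(len(self.servers)):
            server = self.servers[(start + offset) % len(self.servers)]
            try:
                self.connect(server)
            except ServerDown:
                continue  # try the following entry in the list
            self.current = server
            return server
        raise ServerDown("no cache server is reachable")


# Usage: a fake connect function stands in for a real network connection.
alive = {"server1", "server2", "server3"}


def fake_connect(server):
    if server not in alive:
        raise ServerDown(server)


client = ClientConnection(["server1", "server2"], fake_connect)
first = client.connect_next()    # connects to server1
alive.discard("server1")         # server1 goes down
second = client.connect_next()   # fails over to the next entry, server2
```

The application code never sees the switch; it keeps using the same client object while the connection moves down the list.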

Replicated Cache
A clustered replicated cache
consists of two or more cache
servers forming a cluster. All
the data in the cache is
replicated in its entirety to
every node in the cluster.
One benefit of a replicated
cache is that your cache exists
in more than one place, so if
any server goes down, you don't
lose any cache data. Another
benefit is that every cache
server node always finds the
data locally when your
application does a GET
operation. The drawback is that
the cost of updates grows with
the cluster size because every
node has to be updated when you
do an ADD, INSERT, or REMOVE
operation. Another drawback is
that the total size of the
cache is limited to the memory
size of any one node, since the
whole cache is copied to every
node. Therefore, a replicated
cache is suitable for small
clusters, typically 2-6 server
nodes. You can of course have
many more client nodes; this is
only a server node limitation.
If your cluster needs to be
larger, consider using one of
the Partitioned Cache clusters.
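
The trade-off described above (local reads, but updates that touch every node) can be illustrated with a toy model. This is a sketch under simplifying assumptions, not NCache code: each node is an in-memory dictionary, and the `ReplicatedCluster` class and its method names are hypothetical.

```python
class ReplicatedCluster:
    """Toy model: every node holds a full copy of the cache."""

    def __init__(self, node_count):
        self.nodes = [dict() for _ in range(node_count)]

    def insert(self, key, value):
        """Write to every node; returns how many node updates were needed."""
        for store in self.nodes:
            store[key] = value
        return len(self.nodes)

    def remove(self, key):
        """Remove from every node; returns how many node updates were needed."""
        for store in self.nodes:
            store.pop(key, None)
        return len(self.nodes)

    def get(self, node_index, key):
        """Read from one node only; the data is always local to that node."""
        return self.nodes[node_index].get(key)


small = ReplicatedCluster(2)
large = ReplicatedCluster(6)
small_cost = small.insert("customer:1", "Alice")  # 2 node updates
large_cost = large.insert("customer:1", "Alice")  # 6 node updates
```

The write cost grows with the number of server nodes, while a read on any node is served from that node's own copy.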
Below
is a diagram showing a
replicated cluster.

Partitioned Cache
Partitioned cache is best suited
for situations where the amount
of data being cached is so large
that it cannot easily fit in one
node, or where the number of
client nodes is large (50-100 or
more) and you have to grow the
number of server nodes to handle
that many clients. Once you need
more than 2-6 server nodes,
Replicated Cache no longer makes
sense and you should use
Partitioned Cache instead.
Partitioned cache lets you break
the cache into pieces and store
each piece on a different node
in the cluster. This is a very
powerful topology that can
support large environments
easily.

By default, partitioned cache
distributes the data evenly
across all nodes in the cluster.
It maps ranges of hash codes to
nodes to determine where a
particular item should be kept.
This way, NCache can store the
data on a remote node with only
one network trip because all
nodes know which hash code
ranges they're supposed to keep.
Similarly, when you do a GET,
NCache knows which node has the
data and returns it to you in at
most one network trip. This
scheme makes Partitioned Cache
very scalable because the cost
of an ADD or GET is the same
regardless of the cluster size.
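
A range-based mapping of hash codes to nodes can be sketched as follows. The details here are assumptions made for illustration and are not taken from NCache: the hash space of 1000 buckets, the use of CRC32 as the hash function, and the function names are all hypothetical.

```python
import zlib

BUCKETS = 1000  # assumed size of the hash-code space


def build_ranges(nodes, buckets=BUCKETS):
    """Split [0, buckets) into one contiguous, near-equal range per node."""
    size, extra = divmod(buckets, len(nodes))
    ranges = []
    start = 0
    for i, node in enumerate(nodes):
        end = start + size + (1 if i < extra else 0)
        ranges.append((start, end, node))  # half-open range [start, end)
        start = end
    return ranges


def hash_code(key, buckets=BUCKETS):
    """Stable hash code for a string key (CRC32 is an arbitrary choice here)."""
    return zlib.crc32(key.encode("utf-8")) % buckets


def owner(key, ranges):
    """Find the node whose range contains the key's hash code."""
    code = hash_code(key, ranges[-1][1])
    for start, end, node in ranges:
        if start <= code < end:
            return node
    raise ValueError("hash code outside all ranges")


ranges = build_ranges(["node-a", "node-b", "node-c"])
target = owner("customer:1001", ranges)  # same answer on every node
```

Because every node can compute the same answer from the key alone, a request can go straight to the owning node without asking the other nodes first.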
NCache provides location
transparency which means that
your application does not know
where the data is actually
stored. It makes the same API
call to an NCache server node
and the same server node returns
the data whether it was stored locally on
the server node or remotely on
another server node in the
cluster.
By default, a partitioned cache
automatically determines where
data should be stored when the
client application adds an item
to the cache. However, if you
want to control which partition
should keep what data (e.g. to
handle geographically separated
clusters) you can use groups and
then map these groups to
specific partitions to specify
location affinity. Then, when
the application adds an item to
the cache and specifies its
group, NCache knows which
partition should store this data
and directly stores it there.
The same is the case upon a GET
operation.
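
The idea of mapping groups to partitions can be sketched like this. It is a conceptual illustration, not the NCache API: the `PartitionedCache` class, the `group` parameter, and the partition names are hypothetical, and default placement uses a simple modulo on a CRC32 hash as a stand-in for the real distribution scheme.

```python
import zlib


class PartitionedCache:
    """Toy partitioned cache with optional group-to-partition affinity."""

    def __init__(self, partitions, group_map=None):
        self.names = list(partitions)
        self.stores = {name: {} for name in self.names}
        self.group_map = dict(group_map or {})  # group name -> partition name

    def _partition_for(self, key, group=None):
        # An explicitly mapped group wins over automatic placement.
        if group is not None and group in self.group_map:
            return self.group_map[group]
        index = zlib.crc32(key.encode("utf-8")) % len(self.names)
        return self.names[index]

    def add(self, key, value, group=None):
        """Store the item and return the partition that received it."""
        partition = self._partition_for(key, group)
        self.stores[partition][key] = value
        return partition

    def get(self, key, group=None):
        """Look in the same partition that add() would have chosen."""
        return self.stores[self._partition_for(key, group)].get(key)


cache = PartitionedCache(
    ["partition-eu", "partition-us"],
    group_map={"europe": "partition-eu", "americas": "partition-us"},
)
placed = cache.add("order:7", {"total": 42}, group="europe")
fetched = cache.get("order:7", group="europe")
```

Items added without a group still land on a partition chosen automatically, so both styles can coexist in one cache.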

Partitioned Cache with Replicas (Partition-Replica)
NCache also provides a variant
of partitioned cache called
"Partitioned Cache with
Replicas", or simply the
Partition-Replica caching
topology. This is the same as
regular partitioned cache but
lets you have "active replicas"
for each partition. These
replica nodes are not "passive"
but "active", meaning your
application can interact with
them directly, just like the
partition node.
Each replica node forms a
replicated cluster with its
partition node, and you can have
more than one replica node for
each partition. This means that
you can either use replicas as
backup nodes (e.g. one backup
node for each partition, as
shown in the diagram below) or
create multiple replicated
clusters all bound together
through a higher-level
partitioned cache. This allows
you to store geographically
specific data in each partition
and at the same time have
multiple servers (replicas)
hold this data for scalability.

Another variation of
partition-replica is to use the
partition nodes as replicas of
each other. This way, you don't
need to have dedicated replica
nodes and can use the server
nodes keeping the partitions to
serve as replicas of other
partitions in the cluster. Below
is a diagram showing you this.

As you see, Server 1 keeps
Partition 1 and Replica 2 which
is a replica for Partition 2.
The same goes for other nodes
and you achieve the goal of
having all the cache at two
locations so if any node goes
down, you don't lose any cache
data.
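
The layout described above, where each server keeps its own partition plus a replica of another one, can be checked with a small model. This is a sketch and not NCache code: the function names are hypothetical, and wrapping the last server around to hold the replica of Partition 1 is an assumption, since the text only says that the same goes for the other nodes.

```python
def build_layout(server_count):
    """Server i keeps Partition i and a replica of the next partition."""
    if server_count < 2:
        raise ValueError("need at least two servers to hold replicas")
    layout = {}
    for i in range(1, server_count + 1):
        neighbour = i % server_count + 1  # assumed wrap-around for the last server
        layout[f"Server {i}"] = {"partition": i, "replica_of": neighbour}
    return layout


def surviving_partitions(layout, failed_server):
    """Partitions still available (as primary or replica) after one failure."""
    available = set()
    for server, roles in layout.items():
        if server != failed_server:
            available.add(roles["partition"])
            available.add(roles["replica_of"])
    return available


layout = build_layout(3)
after_failure = surviving_partitions(layout, "Server 2")
```

With any single server removed, every partition is still present somewhere in the cluster, which is the property the text describes.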

Client Cache
If your application is on a
remote client node and accesses
the clustered cache remotely,
you may want to use a client
cache to hold a subset of the
clustered cache. This becomes a
cache on top of a cache, giving
you a further performance boost
by reducing trips even to the
clustered cache. NCache provides
this facility and makes sure
that your client cache is always
synchronized with the clustered
cache. If the client cache were
not synchronized with the
clustered cache, it would lead
to a data integrity problem
where the client cache might
hold an older copy of the data
while the clustered cache has
the latest copy. That is why
synchronization is very
important.
Here is how it works. When your
application adds any data to the
clustered cache, it also adds it
to the client cache and
specifies a
cache-sync-dependency between
the item in the local cache and
the item in the remote cache.
The two items don't even need to
have the same cache key. Then,
whenever another application
successfully changes that item
in the clustered cache, an event
is sent to the client cache,
which automatically updates its
copy with the new item. Your
application does not have to do
anything.
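
The synchronization flow above can be modelled with a simple change notification. This is a conceptual sketch only and does not show how NCache implements its cache-sync-dependency: the `ClusteredCache` and `ClientCache` classes, their methods, and the keys are hypothetical, and an in-process callback stands in for the event sent over the network.

```python
class ClusteredCache:
    """Stand-in for the remote clustered cache with change notifications."""

    def __init__(self):
        self.data = {}
        self.listeners = {}  # remote key -> callbacks to run when it changes

    def subscribe(self, key, callback):
        self.listeners.setdefault(key, []).append(callback)

    def insert(self, key, value):
        self.data[key] = value
        for callback in self.listeners.get(key, []):
            callback(value)  # models the event sent to client caches


class ClientCache:
    """Local cache whose items depend on items in the clustered cache."""

    def __init__(self, cluster):
        self.cluster = cluster
        self.local = {}

    def add(self, local_key, remote_key, value):
        """Add to both caches and link the local item to the remote one."""
        self.cluster.insert(remote_key, value)
        self.local[local_key] = value

        def on_change(new_value):
            self.local[local_key] = new_value  # refresh the local copy

        self.cluster.subscribe(remote_key, on_change)

    def get(self, local_key):
        return self.local.get(local_key)


cluster = ClusteredCache()
client_cache = ClientCache(cluster)
client_cache.add("local:product:42", "product:42", {"price": 10})
before = client_cache.get("local:product:42")

# Another application updates the item in the clustered cache.
cluster.insert("product:42", {"price": 12})
after = client_cache.get("local:product:42")
```

Note that the local and remote keys differ in this sketch, matching the point that the two items do not need to share a cache key.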

InProc and OutProc Access
If your application is running
on a server node (the node that
is participating in the
cluster), you can access the
cache either as InProc or
OutProc. For InProc, the cache
lives inside your application
process and performs all the
clustering operations from
there. For OutProc, you have to
start the cache independently
and then connect to it. Both
access modes have their own pros
and cons.
InProc mode in a replicated
cache can give you really fast
GET operations since you're
accessing all the data from
within your own memory space.
However, since your application
and the cache share the same
memory, you may face a memory
size limitation (in terms of how
much you can cache).
OutProc, on the other hand, has
the benefit that multiple
applications on the same machine
can share a common cache.
Additionally, since the cache
lives in its own memory, you
have more memory available to
you. However, there is an
overhead of transferring data
between your application process
and the NCache process.
