Self-Healing 100% Uptime Dynamic Clustering - TayzGrid

TayzGrid has a peer-to-peer cluster architecture that provides self-healing dynamic clustering with 100% uptime. Instead of relying on OS-based clustering, TayzGrid creates its own TCP-based dynamic cluster of cache servers. Servers can be added to or removed from the data grid cluster at runtime.

The following capabilities are available through dynamic clustering:

  • 100% cluster uptime
  • Addition or removal of TayzGrid servers at runtime without stopping the cluster
  • Load balancing of client connections to TayzGrid servers (at the time of connection)
  • If a server goes down, clients automatically reconnect to a different data grid server

Peer to Peer Dynamic Cluster

The peer-to-peer architecture of the dynamic cluster ensures there is no single point of failure in the cluster.

A collection of one or more data grid servers is called an In-Memory Data Grid cluster. In a cluster, every server is connected to every other server. An In-Memory Data Grid cluster always contains a cluster coordinator, which is the oldest (i.e., first) server in the cluster. The job of the cluster coordinator is to manage all memberships within the cluster. If the coordinator node goes down, the role passes to the oldest of the remaining servers. This way, management of the cluster membership has no single point of failure.
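The coordinator rule described above can be illustrated with a small in-memory model. This is only a sketch, not TayzGrid code: the `Cluster` class, its methods, and the server names are hypothetical, and the membership list is simply kept in join order so that the oldest surviving server is always first.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Cluster:
    """Toy membership list kept in join order (oldest server first)."""
    members: List[str] = field(default_factory=list)

    def join(self, server: str) -> None:
        # A server that joins later is appended, so it is "younger".
        if server not in self.members:
            self.members.append(server)

    def leave(self, server: str) -> None:
        # Used both for a planned removal and for a server going down.
        if server in self.members:
            self.members.remove(server)

    @property
    def coordinator(self) -> Optional[str]:
        # The oldest surviving member acts as coordinator.
        return self.members[0] if self.members else None


cluster = Cluster()
for name in ("server-a", "server-b", "server-c"):
    cluster.join(name)

first = cluster.coordinator    # server-a joined first
cluster.leave("server-a")      # the coordinator goes down
second = cluster.coordinator   # the role passes to server-b
```

Because the role is derived from the membership list rather than stored separately, no extra election step is needed in this model when the coordinator leaves.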

Runtime Discovery within Cluster

A new data grid server can be added to the data grid cluster at runtime using the runtime discovery algorithm. This algorithm requires the new server to know about at least one other server in the cluster when it starts. Usually, multiple data grid servers are listed in the configuration file, and the new server establishes a connection with any one of them. It then determines the identity of the cluster coordinator and finally asks the coordinator to add it to the membership list of the cluster.

The coordinator adds this new server to the cluster membership list at runtime. Then it a) informs all the other servers in the cluster about the new server and b) provides information about the existing cluster members to the new server. The new server connects with all the other servers in the cluster using TCP connections.

If the new server cannot find any other server when it starts, it becomes the cluster coordinator and forms a new cluster.
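The join sequence described in this section can be sketched as a short simulation. Everything here is an assumption made for illustration: the `Node` class, the `start` function, and the `network` dictionary (which stands in for real TCP connections) are hypothetical and are not part of TayzGrid.

```python
from typing import Dict, List, Set


class Node:
    """One data grid server in the simulation."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.members: List[str] = []   # membership list, oldest server first
        self.peers: Set[str] = set()   # servers this node is connected to


def start(node: Node, seeds: List[str], network: Dict[str, Node]) -> Node:
    """Start `node` and return the coordinator of the cluster it ends up in."""
    # Step 1: connect to any one of the listed servers that is running.
    contact = next((network[s] for s in seeds if s in network), None)
    if contact is None:
        # No listed server found: form a new cluster as its coordinator.
        node.members = [node.name]
        network[node.name] = node
        return node

    # Step 2: learn who the coordinator is (the oldest member).
    coordinator = network[contact.members[0]]

    # Step 3: the coordinator adds the new server to the membership list,
    # a) informs the existing servers and b) hands the list to the new one.
    existing = list(coordinator.members)
    updated = existing + [node.name]
    for name in existing:
        network[name].members = list(updated)
        network[name].peers.add(node.name)
    node.members = list(updated)

    # Step 4: the new server connects to every other server in the cluster.
    node.peers = set(existing)
    network[node.name] = node
    return coordinator


network: Dict[str, Node] = {}
a, b, c = Node("a"), Node("b"), Node("c")
start(a, ["b", "c"], network)    # nobody is running yet, so a forms the cluster
start(b, ["a", "c"], network)    # b reaches a directly
coordinator_seen_by_c = start(c, ["b"], network)   # c only knows b, still finds a
```

In this model the new server needs just one reachable entry in its seed list, which mirrors the requirement stated above.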

Runtime Discovery by Clients

For connecting to the data grid cluster, the local or remote client only needs to know about one of the data grid servers.

After establishing the connection, the client receives information at runtime from the server it is connected to. This information helps the client determine which data grid servers it needs to connect to and how to access the data. At runtime, the client can get the following information from data grid servers:

  • Cluster membership information: This information is sent to the client in two cases: (1) when it connects to any one of the data grid servers and (2) when cluster membership changes. This means that the client is informed after any addition or removal of data grid servers from the cluster.
  • Caching topology information: This information is sent to the client after it connects to the server. The client uses it to determine which cache server(s) it needs to establish a connection with. More details can be found on the data grid topologies page of this website.
  • Data distribution map: The data distribution map is provided only if the caching topology is Partitioned Cache or Partition-Replica Cache. Clients use it to determine the location of data in the cluster, so they can access the data directly from the cache server that holds it. The data distribution map is provided in two cases: (1) when the client connects to the server and (2) when the partitioning map changes because a server has been added to or removed from the cluster.
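To make the role of the data distribution map concrete, here is a minimal sketch of a client that routes keys with such a map. The source does not describe the map's actual format, so the hash-bucket layout, the bucket count, and the names `bucket_of`, `build_map`, and `Client` are all assumptions chosen for illustration.

```python
import zlib
from typing import Dict, List

BUCKETS = 8  # hypothetical bucket count, chosen only for the example


def bucket_of(key: str) -> int:
    # crc32 is stable across runs, unlike Python's built-in hash() for str.
    return zlib.crc32(key.encode("utf-8")) % BUCKETS


def build_map(servers: List[str]) -> Dict[int, str]:
    """Server side: assign buckets to servers round-robin (an assumption)."""
    return {b: servers[b % len(servers)] for b in range(BUCKETS)}


class Client:
    def __init__(self, distribution_map: Dict[int, str]) -> None:
        # Case (1): the map is received when the client connects.
        self.distribution_map = dict(distribution_map)

    def on_map_changed(self, new_map: Dict[int, str]) -> None:
        # Case (2): a server was added or removed, so the map is replaced.
        self.distribution_map = dict(new_map)

    def server_for(self, key: str) -> str:
        # The client goes straight to the server that owns the key's bucket.
        return self.distribution_map[bucket_of(key)]


client = Client(build_map(["s1", "s2"]))
owner_before = client.server_for("customer:42")
client.on_map_changed(build_map(["s1", "s2", "s3"]))
owner_after = client.server_for("customer:42")
```

The point of the sketch is that the client needs no extra hop through an intermediary: once it holds the current map, locating a key is a local lookup.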


In conclusion, only one data grid server name has to be specified in the data grid client configuration file, although it is recommended that you specify as many servers as possible for redundancy. As soon as the data grid client connects to any one server in the cluster, it receives information about all the other servers. It can then decide whether to make more connections, depending on the data grid topology.
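The benefit of listing several servers in the client configuration can be shown with a small sketch. The `connect` function and its parameters are hypothetical: `reachable` stands in for the servers that currently accept connections, and `membership` for the member list a server would send back after the connection is made.

```python
from typing import Iterable, List, Set, Tuple


def connect(configured_servers: Iterable[str],
            reachable: Set[str],
            membership: List[str]) -> Tuple[str, List[str]]:
    """Try the configured servers in order and return
    (server connected to, full cluster membership learned from it)."""
    for server in configured_servers:
        if server in reachable:
            # One successful connection is enough to learn every member.
            return server, list(membership)
    raise ConnectionError("no configured data grid server is reachable")


# s1 is listed first but is down; the second entry saves the connection.
server, members = connect(["s1", "s2"],
                          reachable={"s2", "s3"},
                          membership=["s2", "s3"])
```

With only `s1` configured, the same call would fail even though the cluster is running, which is why additional entries are recommended.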

This dynamic configuration propagation simplifies the data grid client configuration, because most of the information is kept either in the data grid server configuration or within the data grid cluster at runtime.

Failover Support: Add/Remove Servers at Runtime

TayzGrid's self-healing 100% uptime dynamic clustering provides complete failover support. TayzGrid supports two types of failover:

  • Failover support within cluster: The In-Memory Data Grid cluster is self-healing and automatically adjusts itself if a cluster change occurs. This means that cluster membership information is updated and propagated to all the servers within the cluster so each data grid server can update its connections to all other data grid servers.
  • Failover support for cache clients: All the clients connected to a data grid automatically adjust themselves if a data grid server is added or removed. If a server is removed from the cluster, all of its clients connect to other data grid servers. Similarly, all clients get information about a new data grid server and can choose whether to connect to it.
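Client-side failover as described above can be sketched as a reaction to membership updates. The `GridClient` class is hypothetical, and picking the first remaining member as the new connection target is an arbitrary choice made for the example; the source only says that clients connect to some other data grid server.

```python
from typing import List


class GridClient:
    """Toy client that tracks cluster membership and its current server."""

    def __init__(self, members: List[str], connected_to: str) -> None:
        self.members = list(members)
        self.connected_to = connected_to

    def on_membership_changed(self, members: List[str]) -> None:
        # Called whenever a server is added to or removed from the cluster.
        self.members = list(members)
        if self.connected_to in self.members:
            return  # current server is still in the cluster, nothing to do
        if not self.members:
            raise ConnectionError("no data grid server left in the cluster")
        # Current server was removed: reconnect to another member
        # (first in the list here, purely for determinism).
        self.connected_to = self.members[0]


client = GridClient(["s1", "s2", "s3"], connected_to="s2")
client.on_membership_changed(["s1", "s3"])         # s2 is removed
after_removal = client.connected_to                # now "s1"
client.on_membership_changed(["s1", "s3", "s4"])   # s4 is added
after_addition = client.connected_to               # stays on "s1"
```

When a server is added, the client only records it; whether to open a connection to the new server is left to the client, as the text above describes.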

Failover support allows addition or removal of data grid servers from the cluster at runtime without stopping your application. TayzGrid ensures uninterrupted and lossless execution of your application. Data loss is prevented through various replication schemes provided by TayzGrid. These replication schemes have been discussed on the data grid topologies page.

Local and Remote Clients

The TayzGrid cluster can be accessed through local clients, remote clients, or a combination of the two. Remote clients access the data grid across the network, from outside the data grid cluster. Local clients access the data grid from within the cluster, but in a separate process.

Local client libraries are automatically installed on the data grid server machines. On the other hand, the remote client libraries need to be installed separately on the web application server.

Hot Apply Configuration Changes

TayzGrid allows addition or removal of data grid servers at runtime. It also allows change in some of the data grid configuration information at runtime. These changes can be Hot Applied, which means that these changes can be made without stopping the data grid or application servers.
