How operating system clusters work
The method of clustering that HCL Domino® uses is called application clustering. Domino®, which is an application, monitors the cluster and determines when failover and workload balancing should occur, based on parameters that you set. Another form of clustering is operating system clustering. In this form of clustering, the operating system monitors the cluster and determines when failover should occur. When failover occurs, the server (called a node) to which you fail over takes over the resources of the failed node, accesses the storage space the failed node was accessing, and runs the applications the failed node was running.
There are two basic methods of running operating system clusters, active-passive and active-active. In an active-passive cluster, passive nodes do not run their own applications but instead stand by to take over if active nodes fail. In an active-active cluster, the nodes all run their own applications but are also available to take over if other nodes in the cluster fail. In addition, you can configure an operating system cluster to fail over only when there is a hardware failure or to fail over when there is either a hardware failure or a software failure.
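The difference between the two methods can be illustrated with a minimal Python sketch. It is not based on any operating system cluster product; the `Node` class, the `fail_over` function, and the node and application names are all hypothetical and exist only to show which node runs what before and after failover.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A hypothetical cluster node; not a real cluster API."""
    name: str
    apps: List[str] = field(default_factory=list)  # applications the node runs
    up: bool = True


def fail_over(failed: Node, target: Node) -> None:
    """Mark a node as failed and move its applications to the target node."""
    failed.up = False
    target.apps.extend(failed.apps)
    failed.apps = []


# Active-passive: the passive node runs nothing until the active node fails.
active = Node("node-a", ["mail"])
standby = Node("node-b")
fail_over(active, standby)
print(standby.apps)  # ['mail']

# Active-active: both nodes run their own applications, and the surviving
# node runs both sets after failover.
left = Node("node-c", ["mail"])
right = Node("node-d", ["web"])
fail_over(left, right)
print(right.apps)  # ['web', 'mail']
```

In the active-passive case the standby node carries only the failed node's work after failover. In the active-active case the surviving node carries its own work plus the failed node's work.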
Because Domino® uses application clustering, this section does not give detailed information about the various methods and configurations that operating system clusters use. However, because you can run Domino® in conjunction with several operating system clusters, including High Availability Cluster Multi-Processing (HACMP™) and Microsoft™ Cluster Server (MSCS), this section describes basic information about operating system clusters.
Operating system clusters provide failover that is transparent to users. Because the receiving node takes over the resources of the failed node, the user sees the same server name and same network address as on the original server. Unlike many operating system clusters, Domino® clustering performs intelligent failover. When a server fails, Domino® checks its cluster cache to find the server that is most available in the cluster. Domino® also lets you actively control workload balancing, which operating system clusters may not offer. In addition, Domino® clustering lets you set up clusters of servers that run different operating systems, while operating system clusters require that all nodes run the same operating system.
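The idea of choosing the most available server can be sketched as follows. This is an illustration only, not the Domino® implementation: the `pick_failover_target` function, the dictionary that stands in for the cluster cache, the server names, and the availability scores are all assumptions made for this example, with a higher score meaning that a server is more available.

```python
from typing import Dict


def pick_failover_target(cluster_cache: Dict[str, int], failed: str) -> str:
    """Return the most available server other than the failed one.

    cluster_cache is a hypothetical mapping of server name to an
    availability score, where a higher score means more available.
    """
    candidates = {
        name: score for name, score in cluster_cache.items() if name != failed
    }
    if not candidates:
        raise ValueError("no other server in the cluster to fail over to")
    # Pick the candidate with the highest availability score.
    return max(candidates, key=candidates.get)


cache = {"Mail1/Acme": 35, "Mail2/Acme": 80, "Mail3/Acme": 60}
target = pick_failover_target(cache, failed="Mail1/Acme")
print(target)  # Mail2/Acme
```

An operating system cluster, by contrast, fails over to a node that takes over the failed server's name and address, so no comparable selection among differently named servers is visible to the user.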
To run Domino® in an active-active cluster, you must use Domino® partitioned servers on the nodes. Doing so lets each node take over the tasks of the other node while also maintaining its own tasks.
To use an active-active configuration, make sure that each node has enough capacity to handle the load of the other node, in addition to its own load, if failover occurs.
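A rough way to reason about this requirement is to check that the combined load of both nodes fits within the capacity of either node alone. The following Python sketch assumes that load and capacity are measured in the same unit (for example, concurrent users); the `pair_is_safe` function and the numbers are hypothetical, and real sizing depends on your hardware and workload.

```python
def pair_is_safe(load_a: int, capacity_a: int, load_b: int, capacity_b: int) -> bool:
    """Return True if either node could carry both workloads after failover.

    Loads and capacities are in the same hypothetical unit,
    such as concurrent users.
    """
    combined = load_a + load_b
    return combined <= capacity_a and combined <= capacity_b


# Each node can hold 1000 users; 400 + 500 fits on either node.
print(pair_is_safe(400, 1000, 500, 1000))  # True

# 700 + 500 exceeds what a single node can hold, so failover would overload it.
print(pair_is_safe(700, 1000, 500, 1000))  # False
```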
Benefits of using OS clusters with Domino® clusters
- Most Domino® agents do not fail over, so when a server fails over in a Domino® cluster, agents that were running do not continue running on the new server. If these agents are configured to run on a specific server, they will not be able to run on another server after Domino® failover occurs. In an operating system cluster, however, the same server name is used after failover occurs. Therefore, the agents can run on this server. In an operating system cluster, agents that were running on a schedule when failover occurred will restart the next time they are scheduled to run.
- If you have applications that use hard-coded server names, the applications will not work if they fail over to a different Domino® server. These applications will run after failing over in an operating system cluster, however, because the server name is still the same.
- If a user is editing a document when the server fails, the user cannot save the document in a Domino® cluster. The user has to paste the document into a replica on the new server. In an operating system cluster, however, users can save documents that they were editing when the server failed.
- The Administration Process does not fail over in Domino®. Therefore, it is useful to set up operating system clusters for your administration servers.
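
The first two benefits come down to whether the server name stays the same after failover. The following Python sketch shows that reasoning for an agent that is configured to run on a specific server. The `Agent` class, the `can_run` function, and the agent and server names are hypothetical and do not represent Domino® APIs.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    """A hypothetical agent that is tied to one server name."""
    name: str
    run_on: str  # server name the agent is configured to run on


def can_run(agent: Agent, serving_name: str) -> bool:
    """An agent tied to a server name runs only on a server with that name."""
    return agent.run_on == serving_name


agent = Agent("NightlyArchive", run_on="Mail1/Acme")

# Domino failover: work moves to a different server with a different name.
print(can_run(agent, "Mail2/Acme"))  # False

# Operating system failover: the receiving node uses the same server name.
print(can_run(agent, "Mail1/Acme"))  # True
```

The same comparison applies to applications that use hard-coded server names.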