How operating system clusters work

The method of clustering that HCL Domino® uses is called application clustering. Domino®, which is an application, monitors the cluster and determines when failover and workload balancing should occur, based on parameters that you set. Another form of clustering is operating system clustering. In this form of clustering, the operating system monitors the cluster and determines when failover should occur. When failover occurs, the server (called a node) to which you fail over takes over the resources of the failed node, accesses the storage space the failed node was accessing, and runs the applications the failed node was running.

There are two basic methods of running operating system clusters: active-passive and active-active. In an active-passive cluster, passive nodes do not run their own applications but instead stand by to take over if active nodes fail. In an active-active cluster, all nodes run their own applications but are also available to take over if other nodes in the cluster fail. In addition, you can configure an operating system cluster to fail over only when there is a hardware failure, or to fail over when there is either a hardware failure or a software failure.
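
The difference between the two methods can be sketched in a few lines of Python. This is only an illustrative model, not how any particular operating system cluster product is implemented: the Node class, the fail_over function, and the rule of choosing the surviving node that runs the fewest applications are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical cluster node used only for this illustration."""
    name: str
    apps: list = field(default_factory=list)  # applications the node currently runs
    healthy: bool = True


def fail_over(failed, nodes):
    """Move the failed node's applications to a surviving node.

    Assumed policy: choose the healthy node running the fewest applications,
    so an idle (passive) node is preferred when one exists.
    """
    failed.healthy = False
    survivors = [n for n in nodes if n.healthy and n is not failed]
    if not survivors:
        raise RuntimeError("no healthy node available to take over")
    target = min(survivors, key=lambda n: len(n.apps))
    target.apps.extend(failed.apps)  # the target now runs the failed node's applications
    failed.apps = []
    return target


# Active-passive: node_b runs nothing of its own and only stands by.
node_a = Node("node_a", ["mail"])
node_b = Node("node_b")
target = fail_over(node_a, [node_a, node_b])
print(target.name, target.apps)  # node_b ['mail']

# Active-active: node_d keeps its own application and adds the failed node's.
node_c = Node("node_c", ["mail"])
node_d = Node("node_d", ["web"])
target2 = fail_over(node_c, [node_c, node_d])
print(target2.name, target2.apps)  # node_d ['web', 'mail']
```

In the active-passive case the standby node ends up running only the failed node's work, while in the active-active case the surviving node carries both workloads.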

Because Domino® uses application clustering, this section does not give detailed information about the various methods and configurations that operating system clusters use. However, because you can run Domino® in conjunction with several operating system clusters, including High Availability Cluster Multi-Processing (HACMP™) and Microsoft™ Cluster Server (MSCS), this section provides basic information about operating system clusters.

Note: For information about configuring your operating system cluster software to run with Domino®, see the documentation that came with your operating system cluster.

Operating system clusters provide failover that is transparent to users. Because the receiving node takes over the resources of the failed node, the user sees the same server name and the same network address as on the original server. Unlike many operating system clusters, Domino® clustering does intelligent failover. When a server fails, Domino® checks its cluster cache to find the server that is most available in the cluster. Domino® also lets you actively control workload balancing, which operating system clusters may not offer. In addition, Domino® clustering lets you set up clusters of servers that run different operating systems, while operating system clusters require that all nodes run the same operating system.
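
The idea of choosing the most available server can be illustrated with a minimal Python sketch. It does not reproduce the actual format of the Domino® cluster cache: the dictionary layout, the server names, the availability scores (assumed here to range from 0 to 100, with higher meaning more available), and the pick_failover_server function are all hypothetical.

```python
def pick_failover_server(cluster_cache, failed_server):
    """Return the name of the most available server other than the failed one.

    cluster_cache maps a server name to a hypothetical availability score,
    where a higher score means the server is more available.
    Returns None when no other server is listed.
    """
    candidates = {
        name: score
        for name, score in cluster_cache.items()
        if name != failed_server
    }
    if not candidates:
        return None
    # Pick the candidate with the highest availability score.
    return max(candidates, key=candidates.get)


# Hypothetical cache contents for a three-server cluster.
cache = {"Mail1/Acme": 15, "Mail2/Acme": 72, "Mail3/Acme": 48}
print(pick_failover_server(cache, "Mail1/Acme"))  # Mail2/Acme
```

By contrast, an operating system cluster typically fails over to a node defined by its configuration, without comparing how busy the candidate nodes are.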

To run Domino® in an active-active cluster, you must use Domino® partitioned servers on the nodes. Doing so lets each node take over the tasks of the other node while also maintaining its own tasks.

To use an active-active configuration, you must be sure that each node can handle the load of the other node if failover occurs.
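
A rough way to reason about this requirement is to add up the workloads of both nodes and compare the total with what each node can carry alone. The following Python sketch assumes a two-node cluster and uses made-up load and capacity figures in arbitrary units; the can_absorb_failover function is hypothetical, and real capacity planning should be based on measured server statistics.

```python
def can_absorb_failover(loads, capacities):
    """Report, per node, whether it could carry the combined workload alone.

    loads and capacities map node names to numbers in the same
    (arbitrary, assumed) unit. Intended for a two-node active-active pair.
    """
    total = sum(loads.values())
    return {node: total <= capacities[node] for node in loads}


# Hypothetical figures: node_b is a smaller machine than node_a.
loads = {"node_a": 45, "node_b": 40}
capacities = {"node_a": 100, "node_b": 80}
print(can_absorb_failover(loads, capacities))
# {'node_a': True, 'node_b': False}
```

In this example node_a could take over for node_b, but node_b could not take over for node_a, so the configuration would not be safe as an active-active cluster.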

Benefits of using OS clusters with Domino® clusters

When you use an operating system cluster in conjunction with a Domino® cluster, the few things that do not fail over in a Domino® cluster will fail over in the operating system cluster.

Note: For these features, it is a good idea to set up an active-passive operating system cluster to run in conjunction with the Domino® cluster.

Here are a few examples:

Most Domino® agents do not fail over, so when a server fails over in a Domino® cluster, agents that were running do not continue running on the new server. If these agents are configured to run on a specific server, they will not be able to run on another server after Domino® failover occurs. In an operating system cluster, however, the same server name is used after failover occurs. Therefore, the agents can run on this server. In an operating system cluster, agents that were running on a schedule when failover occurred will restart the next time they are scheduled to run.
If you have applications that use hard-coded server names, the applications will not work if they fail over to a different Domino® server. These applications will run after failing over in an operating system cluster, however, because the server name is still the same.
If a user is editing a document when the server fails, the user cannot save the document in a Domino® cluster. Instead, the user has to paste the document into a replica on the new server. In an operating system cluster, however, users can save documents that they were editing when the server failed.
The Administration Process does not fail over in Domino®. Therefore, it is useful to set up operating system clusters for your administration servers.