Example of configuring connection management for a high-availability replication system
You can use a Connection Manager to route client connections for the participants of a replicate set and to control failover for high-availability clusters that participate in Enterprise Replication.
About this task
- server_1a - ER Node 1, primary server
- server_1b - SD secondary server
- server_1c - HDR secondary server
- server_1d - RS secondary server
- server_2 - ER Node 2
- server_3 - ER Node 3
- server_4 - ER Node 4
The grid supports reporting services, which can run on any of the ER nodes.
- Client requests are directed to the ER node with the lowest transaction latency.
- The system can withstand the failure of a Connection Manager.
- The system can withstand a network-interface card (NIC) failure on each host.
- The Connection Managers control failover for the cluster.
- If failover occurs, it the SD secondary server takes priority over the HDR secondary server. The HDR secondary server takes priority over the RS secondary server.
- If the primary server fails, the Connection Managers can still connect to the cluster after restarting.
Procedure
- Install at least two network interface cards on each host.
This prevents the failure of a network interface card from causing Connection Manager or database server failure.
- Install two Connection Managers. Install each Connection
Manager onto a different host, and do not install the Connection Managers
onto the hosts that database servers are installed on.This installation strategy prevents a Connection Manager from becoming a single point of failure, and prevents the simultaneous failure of database servers and Connection Managers if a host fails.
- On each Connection Manager host, set the INFORMIXDIR environment
to the directory the Connection Manager was installed into.
Run the following command:
setenv INFORMIXDIR path
- Create a configuration file in each Connection Manager
installation's $INFORMIXDIR/etc directory.
The first Connection Manager's configuration file is named cm_1.cfg and has the following entries:
NAME connection_manger_1 LOG 1 LOGFILE $INFORMIXDIR/tmp/my_cm1_log.log LOCAL_IP 192.0.0.2,192.0.2.1 REPLSET replicate_set_1 { INFORMIXSERVER g_server_1,g_server_2,g_server_3,g_server_4 SLA report_1 DBSERVERS=ANY \ POLICY=LATENCY } CLUSTER cluster_1 { INFORMIXSERVER g_server_1 FOC ORDER=ENABLED \ PRIORITY=1 CMALARMPROGRAM $INFORMIXDIR/etc/CMALARMPROGRAM.sh }
The second Connection Manager's configuration file is named cm_2.cfg and has the following entries:NAME connection_manger_2 LOG 1 LOGFILE $INFORMIXDIR/tmp/my_cm2_log_.log LOCAL_IP 192.0.2.2,192.0.2.3 REPLSET replicate_set_1 { INFORMIXSERVER g_server_1,g_server_2,g_server_3,g_server_4 SLA report_2 DBSERVERS=ANY \ POLICY=LATENCY } CLUSTER cluster_1 { INFORMIXSERVER g_server_1 FOC ORDER=ENABLED \ PRIORITY=2 CMALARMPROGRAM $INFORMIXDIR/etc/CMALARMPROGRAM.sh }
The configuration file specifies the following information and behavior:- Logging is enabled, and the log files are $INFORMIXDIR/tmp/my_cm1_log.log and $INFORMIXDIR/tmp/my_cm2_log.log.
- connection_manager_1 monitors 192.0.2.0 and 192.0.2.1 and connection_manager_2 monitors 192.0.2.2 and 192.0.2.3 for network failure.
- When the Connection Managers start, they each search their sqlhosts files for g_server_1, g_server_2, g_server_3, and g_server_4 entries, and then connect to the servers server_1a, server_1b, server_1c, server_1d,server_2, server_3, and server_4 that are in those groups.
CONNECT TO @report_1
andCONNECT TO @report_2
connection requests are directed to the replication server that has the lowest transaction latency.- The connection between connection_manager_1 and the primary server is prioritized over the connection between connection_manager_2 and the primary server. Failover that would break the connectivity between connection_manager_1 and the primary server is blocked.
- If failover processing fails after eight attempts, $INFORMIXDIR/etc/CMALARMPROGRAM.sh is called.
- The EVENT_TIMEOUT parameter is not set, so the Connection Managers wait 60 seconds for primary-server events before failover processing begins. The SECONDARY_EVENT_TIMEOUT parameter is not set, so the Connection Managers wait 60 seconds for secondary-server events before the Connection Manager disconnects from the secondary server.
- The HOST, NETTYPE, SERVICE, and SQLHOSTSOPT attributes of the SLA parameters are not set, so each Connection Manager uses connection information in local and remote sqlhosts files.
- The SQLHOSTS parameter is not set, so each Connection Manager first searches its local sqlhosts file, and then remote database server sqlhosts files for connectivity information related to server_1, server_2, server_3, and server_4.
- The WORKERS attributes of the SLA parameters are not set, so four worker threads are allocated to each of the SLAs.
- Set the onconfig file DRAUTO configuration
parameter on server_1a, server_1b, server_1c,
and server_1d to
3
DRAUTO 3
This setting specifies that a Connection Manager controls failover arbitration. - Set the onconfig file HA_FOC_ORDER
configuration parameter on server_1a to
SDS,HDR,RSS
HA_FOC_ORDER SDS,HDR,RSS
After the Connection Managers start, and connect to server_1a, the HA_FOC_ORDER value replaces the value of the
ORDER
attributes in each Connection Manager's configuration file.If server_1a fails, the Connection Managers attempt failover to the SD secondary server. If the SD secondary server is also unavailable, the Connection Managers attempt failover to the HDR secondary server. If the HDR secondary server is also unavailable, the Connection Managers attempt failover to the RS secondary server.
- Add entries to thesqlhosts files on
the hosts of each database server and Connection Manager.
#dbservername nettype hostname servicename options g_server_1 group - - i=1,c=1,e=server_1d server_1a onsoctcp host_1 port_1 g=g_server_1 server_1b onsoctcp host_1 port_2 g=g_server_1 server_1c onsoctcp host_2 port_3 g=g_server_1 server_1d onsoctcp host_3 port_4 g=g_server_1 g_server_2 group - - i=2,e=server_2 server_2 onsoctcp host_4 port_5 g=g_server_2 g_server_3 group - - i=3,e=server_3 server_3 onsoctcp host_5 port_6 g=g_server_3 g_server_4 group - - i=4,e=server_4 server_4 onsoctcp host_6 port_7 g=g_server_4
- Create a sqlhosts file on each client
host.
#dbservername nettype hostname servicename options report group - - c=1,e=report_2 report_1 onsoctcp cm_host_1 cm_port_3 g=report report_2 onsoctcp cm_host_2 cm_port_4 g=report
If a Connection Manager fails, client applications can still connect to the other Connection Manager because the report group is defined.
CONNECT TO @report
connection requests are directed through one of the Connection Managers to the replication server that has the lowest transaction latency. - Set each INFORMIXSQLHOSTS environment
variable to the sqlhosts file location by running
the setenv command on each Connection Manager and
client host.
setenv INFORMIXSQLHOSTS path_and_file_name
- Turn on quality of data (QOD) monitoring by running the cdr
define qod command.
cdr define qod -c server_1a --start
The command connects to server_1a, defines server_1a as a master server for monitoring data, and then turns on quality of data monitoring.server_1a monitors transaction latency for the replication servers in the grid.
- Run the oncmsm utility on each Connection
Manager host, to start each Connection Manager.On the host of connection_manager_1:
oncmsm -c cm_1.cfg
On the host of connection_manager_2:oncmsm -c cm_2.cfg
- Check each Connection Manager's log file to verify that the Connection Manager started correctly.