Network link problems
HCL Workload Automation has a high degree of fault tolerance in the event of a communications problem. Each fault-tolerant agent has its own copy of the Symphony file, containing the production period's processing. When link failures occur, the agents continue processing using their own copies of the Symphony file. Any inter-workstation dependencies, however, must be resolved locally using the appropriate console manager (conman) commands, for example deldep and release.
While a link is down, any messages destined for a non-communicating workstation are stored by the sending workstation in the <TWA_home>/TWS/pobox directory, in files named <workstation>.msg. When the links are restored, the workstations begin sending their stored messages. If the links to a domain manager are down for an extended period of time, it might be necessary to switch to a backup domain manager (see IBM® Workload Scheduler: Administration Guide).
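To see which workstations have messages stored for them while a link is down, you can list the <workstation>.msg files in the pobox directory together with their sizes. The following Python sketch is an illustration only, not a product utility: the function name queued_message_files is hypothetical, and the pobox path must be supplied by you because <TWA_home> differs for each installation. A file that keeps growing suggests that messages for that workstation are accumulating, but the script makes no assumption about the size of an empty message file.

```python
from pathlib import Path
import sys


def queued_message_files(pobox_dir):
    """Return (workstation, size_in_bytes) pairs for each *.msg file, largest first."""
    entries = []
    for path in Path(pobox_dir).glob("*.msg"):
        if path.is_file():
            # The file name without the .msg suffix is the workstation name.
            entries.append((path.stem, path.stat().st_size))
    # Sort by size descending, then by name so that the order is stable.
    entries.sort(key=lambda item: (-item[1], item[0]))
    return entries


if __name__ == "__main__":
    # Pass the pobox directory explicitly; default to the current directory.
    pobox = sys.argv[1] if len(sys.argv) > 1 else "."
    for workstation, size in queued_message_files(pobox):
        print(f"{workstation}\t{size} bytes")
```

Run the script on the sending workstation, passing the full path of its pobox directory as the only argument.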
- The conman submit job and submit schedule commands can be issued on an agent that cannot communicate with its domain manager, provided that the agent is configured for, and can make, a direct HTTP connection to the master domain manager. This is configured using the conman connection options in the localopts file, or the corresponding options in the useropts file for the user (see the IBM® Workload Scheduler: Administration Guide for details). However, all events have to pass through the domain manager, so although jobs and job streams can be submitted, their progress can only be monitored locally, not at the master domain manager. It is therefore important to correct the link problem as soon as possible.
- If the link to a standard agent workstation is lost, there is no temporary recovery option available, because standard agents are hosted by their domain managers. In networks with a large number of standard agents, you can choose to switch to a backup domain manager.
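Before relying on the direct connection described in the first item above, it can help to confirm that the connection options are actually set. The following Python sketch assumes that the options file uses "name = value" lines with "#" comments, and that the connection options are named host, protocol, and port. Both points are assumptions made for illustration; check the option names and file syntax against the Administration Guide. The function names and the sample values are hypothetical.

```python
def read_options(text):
    """Parse 'name = value  # comment' lines into a dict with lowercased names."""
    options = {}
    for raw in text.splitlines():
        # Drop anything after '#', which also skips commented-out options.
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        options[name.strip().lower()] = value.strip()
    return options


def missing_connection_options(text, required=("host", "protocol", "port")):
    """Return the required option names that are absent or have an empty value."""
    options = read_options(text)
    return [name for name in required if not options.get(name)]


if __name__ == "__main__":
    # Illustrative content only; the host name and port are made up.
    sample = """
    # Attributes for CLI connections
    HOST = mdm.example.com
    PROTOCOL = https
    #PORT = 31116
    """
    print(missing_connection_options(sample))  # prints ['port']
```

To check a real file, read its contents and pass the text to missing_connection_options. An empty list means that every option named in required has a value; it does not prove that the connection itself works.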