Running event-driven workload automation

Event-driven workload automation adds on-demand workload automation to plan-based job scheduling. You define rules that trigger on-demand workload automation when specific events occur.

The object of event-driven workload automation in HCL Workload Automation is to carry out a predefined set of actions in response to events that occur on nodes running HCL Workload Automation (and also on nodes that do not run it, when you use the sendevent command). This implies the capability to submit workload and run commands on the fly, and to notify users via email.

The main tasks of event-driven workload automation are:
  • Trigger the execution of batch jobs and job streams based on the reception or combination of real-time events.
  • Reply to prompts.
  • Notify users when anomalous conditions occur in the HCL Workload Automation scheduling environment or batch scheduling activity.
  • Invoke an external product when a particular event condition occurs.
Event-driven workload automation is based on the concept of an event rule. In HCL Workload Automation, an event rule is a scheduling object that includes the following items:
  • Events
  • Event-correlating conditions
  • Actions
When you define an event rule, you specify one or more events, a correlation rule, and one or more actions that are triggered by those events. You can also specify validity dates, a daily time interval of activity, and a common time zone for all the time restrictions that are set.
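The parts of an event rule listed above can be pictured as a small data structure. The following Python sketch is illustrative only: the `EventRule` class, its field names, and the free-text event and action strings are assumptions made for this example, not the product's XML schema or API. The sample values are taken from Scenario 3 later in this topic.

```python
from dataclasses import dataclass, field
from datetime import date, time
from typing import Optional


@dataclass
class EventRule:
    """Hypothetical in-memory model of an event rule (not the product schema)."""
    name: str
    events: list[str]                                      # one or more events
    actions: list[str]                                     # one or more actions
    correlation: list[str] = field(default_factory=list)   # correlation attributes
    valid_from: Optional[date] = None                      # validity dates
    valid_to: Optional[date] = None
    active_start: Optional[time] = None                    # daily time interval
    active_end: Optional[time] = None
    time_zone: Optional[str] = None                        # common to all time limits

    def __post_init__(self):
        # A rule without events or actions cannot trigger anything.
        if not self.events:
            raise ValueError("an event rule needs at least one event")
        if not self.actions:
            raise ValueError("an event rule needs at least one action")


# Scenario 3 expressed with the hypothetical model (strings are descriptive only).
ftp_rule = EventRule(
    name="submit-after-ftp",
    events=["file daytransac* created in SFoperation on system1"],
    actions=["submit job stream calmonthlyrev"],
    active_start=time(18, 0),
    active_end=time(22, 0),
    time_zone="EST",
)
```

The validation in `__post_init__` only mirrors the "one or more" wording above; the real product may enforce additional constraints.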
The events that HCL Workload Automation can detect for action triggering can be:
Internal events
They are events involving HCL Workload Automation internal application status and changes in the status of HCL Workload Automation objects. Events of this category can be job or job stream status changes, critical jobs or job streams being late or canceled, and workstation status changes.
External events
They are events not directly involving HCL Workload Automation that may nonetheless impact workload submission. Events of this category can be messages written in log files, events sent by third party applications, or a file being created, updated, or deleted.

Within a rule, two or more events can be correlated through correlation attributes such as a common workstation or job. Typically, each active rule has one copy running in the event processing server. However, sometimes the same rule is needed for different groups of events, which are often related to different groups of resources. Correlation attributes direct a rule to create a separate copy of itself for each group of events that share common characteristics.
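As a rough illustration of how correlation attributes split one rule into several copies, the following Python sketch groups incoming events by the values of the chosen attributes. The event dictionaries, the event type names, and the `group_by_correlation` helper are hypothetical; the sketch does not describe how the event processing server is actually implemented.

```python
from collections import defaultdict


def group_by_correlation(events, correlation_attributes):
    """Return one event group (one conceptual rule copy) per distinct
    combination of correlation attribute values."""
    copies = defaultdict(list)
    for event in events:
        key = tuple(event.get(attr) for attr in correlation_attributes)
        copies[key].append(event)
    return dict(copies)


# Hypothetical events; the "type" values are made up for this example.
events = [
    {"type": "JobStatusChanged", "workstation": "CPU1", "job": "JOB_A"},
    {"type": "JobLate", "workstation": "CPU1", "job": "JOB_A"},
    {"type": "JobStatusChanged", "workstation": "CPU2", "job": "JOB_B"},
]

copies = group_by_correlation(events, ["workstation"])
for key, group in copies.items():
    print(key, len(group))
```

With `workstation` as the only correlation attribute, the events for CPU1 and CPU2 land in separate groups, so each workstation is evaluated independently.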

The actions that HCL Workload Automation can run when it detects any of these events can be:
Operational actions
They are actions that cause a change in the status of scheduling objects. Actions of this category are submitting a job, job stream, or command, or replying to a prompt.
Notification actions
They are actions that have no impact on the status of scheduling objects. Actions belonging to this category are sending an email, logging the event in an internal auditing database, or running a non-HCL Workload Automation command.
This classification of events and actions is conceptual. It has no impact on how they are handled by the event-driven mechanism.

Simple event rule scenarios

This section lists some simple scenarios involving the use of event rules. The corresponding XML coding is shown in Event rule examples.
Scenario 1: Send email notification
  1. The administrator defines the following event rule:
    • When any of the job123 jobs terminates in error and yields the following error message:
      AWSBHT001E The job "MYWORKSTATION#JOBS.JOB1234" in file "ls" has
      failed with the error: AWSBDW009E The following operating system
      error occurred retrieving the password structure for either the
      logon user...
      send an email to operator john.smith@mycorp.com. The subject of the email includes the names of the job instance and of the associated workstation.

      The event rule is valid from December 1st to December 31st in the 12:00-16:00 EST time window.

  2. The administrator saves the rule as non-draft in the database and it is readily deployed by HCL Workload Automation.
  3. The scheduler starts monitoring the jobs and every time one of them ends in error, John Smith is sent an email so that he can check the job and take corrective action.
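The validity restriction in this scenario (December 1st to December 31st, 12:00-16:00 EST) can be sketched as a simple check in Python. This is an illustration under stated assumptions: the `rule_is_active` function is hypothetical, EST is modeled as a fixed UTC-5 offset, both ends of the date range and of the time window are treated as inclusive, and the year in the sample timestamps is arbitrary. The product may evaluate these limits differently.

```python
from datetime import datetime, time, timedelta, timezone

# Assumption: EST modeled as a fixed UTC-5 offset.
EST = timezone(timedelta(hours=-5), "EST")


def rule_is_active(moment, first_day, last_day, start, end, tz):
    """Check a timezone-aware moment against a (month, day) validity range
    and a daily time window, both expressed in the rule's time zone."""
    local = moment.astimezone(tz)
    in_dates = first_day <= (local.month, local.day) <= last_day
    in_hours = start <= local.time() <= end
    return in_dates and in_hours


# 18:00 UTC is 13:00 EST, which falls inside the 12:00-16:00 window.
sample = datetime(2024, 12, 15, 18, 0, tzinfo=timezone.utc)
active = rule_is_active(sample, (12, 1), (12, 31), time(12, 0), time(16, 0), EST)
```

Converting the event time to the rule's time zone first reflects the idea of one common time zone for all time restrictions of the rule.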
Scenario 2: Monitor that workstation links back
  1. The administrator defines the following event rule:
    • If workstation CPU1 becomes unlinked and does not link back within 10 minutes, send a notification email to chuck.derry@mycorp.com.
  2. The administrator saves the rule as non-draft in the database and it is readily deployed by HCL Workload Automation.
  3. The scheduler starts monitoring CPU1.

    If the workstation status becomes unlinked, HCL Workload Automation starts the 10-minute timeout. If the CPU1 linked event is not received within 10 minutes, the scheduler sends the notification email to Chuck Derry.

  4. Chuck Derry receives the email, queries the actions/rules that were triggered in the last 10 minutes, and from there navigates to the CPU1 instance and runs a first problem analysis.
Scenario 3: Submit job stream when FTP has completed
  1. The administrator defines the following event rule:
    • When file daytransac* is created in the SFoperation directory in workstation system1, and modifications to the file have terminated, submit the calmonthlyrev job stream.

      The event rule is valid year-round in the 18:00-22:00 EST time window.

  2. The administrator saves the rule as non-draft in the database and it is readily deployed by HCL Workload Automation.
  3. The scheduler starts monitoring the SFoperation directory. As soon as file daytransac* is created and is no longer in use, it submits job stream calmonthlyrev.
  4. The operator can check the logs to find out whether the event rule or the job stream was run.
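The condition in this scenario has two parts: the file name matches daytransac*, and the file is no longer being modified. The following Python sketch shows one possible way to express that check. The `file_is_ready` function is hypothetical, and treating a file as settled after a 60-second quiet period is an assumption made for the example; it is not a description of how the product detects that modifications have terminated.

```python
from datetime import datetime, timedelta
from fnmatch import fnmatchcase


def file_is_ready(name, last_modified, now,
                  pattern="daytransac*", quiet=timedelta(seconds=60)):
    """Return True when the name matches the pattern and the file has not
    been modified for at least the (assumed) quiet period."""
    return fnmatchcase(name, pattern) and (now - last_modified) >= quiet


now = datetime(2024, 1, 10, 19, 0, 0)
# Still being written 10 seconds ago: not ready yet.
still_changing = file_is_ready("daytransac_0110.dat", now - timedelta(seconds=10), now)
# Untouched for 5 minutes: ready, so calmonthlyrev could be submitted.
settled = file_is_ready("daytransac_0110.dat", now - timedelta(minutes=5), now)
```

`fnmatchcase` is used rather than `fnmatch` so that the match is case-sensitive on every platform.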
Scenario 4: Start long duration jobs based on timeout
  1. The administrator defines the following event rule:
    • When the job-x=exec event and the job-x=succ/abend event are received within 5 minutes, the scheduler should reply Yes to prompt-1 and start the jobstream-z job stream; otherwise, it should send an email to twsoper@mycompany.com alerting that the job is late.
  2. The administrator saves the event rule in draft status. After a few days the administrator edits the rule, changes the email recipient and saves it as non-draft. The rule is deployed.
  3. Every time the status of job-x becomes exec, HCL Workload Automation starts the 5-minute timeout.

    If the internal state of job-x does not change to succ or abend within 5 minutes and the corresponding event is not received, HCL Workload Automation sends the email, otherwise it replies Yes to the prompt and submits jobstream-z.
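The timeout logic of this scenario can be summarized in a short Python sketch. The `actions_for_job_x` function and the action strings it returns are hypothetical and only illustrate the decision; they are not product commands. Scenario 2 follows the same shape, with a 10-minute timeout and only the timeout branch.

```python
from datetime import datetime, timedelta


def actions_for_job_x(exec_at, ended_at, now, timeout=timedelta(minutes=5)):
    """Decide which actions apply once job-x has entered the exec state.

    exec_at  -- when the job-x=exec event was received
    ended_at -- when the job-x=succ/abend event was received, or None
    now      -- current time
    """
    deadline = exec_at + timeout
    if ended_at is not None and ended_at <= deadline:
        # Both events arrived within the timeout.
        return ["reply Yes to prompt-1", "submit jobstream-z"]
    if now >= deadline:
        # The timeout expired without the second event.
        return ["send email to twsoper@mycompany.com"]
    # Still waiting: nothing to do yet.
    return []


start = datetime(2024, 1, 10, 9, 0)
on_time = actions_for_job_x(start, start + timedelta(minutes=3), start + timedelta(minutes=3))
late = actions_for_job_x(start, None, start + timedelta(minutes=6))
```

Returning an empty list while the timeout is still running keeps the two outcomes mutually exclusive: either the success actions or the late notification applies, never both.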

Scenario 5: Monitoring process status and running a batch script
The administrator creates a rule to monitor the status of HCL Workload Automation processes and run a batch script.
Scenario 6: Monitoring the Symphony file status and logging the occurrence of a corrupt record
The administrator creates a rule to monitor the status of the Symphony file in the HCL Workload Automation instance and logs the occurrence of a corrupt Symphony dependency record in the internal auditing database.

For a detailed example of how to set up an event rule that monitors the used disk space (TivoliWorkloadSchedulerFileSystemFilling), see Monitoring the disk space used by HCL Workload Automation.