Posting incidents and sharing information through a chat tool
When a problem arises, reacting quickly is crucial. Identifying the issue, gathering possible solutions, and choosing the best way to proceed are the fundamentals of problem-solving, and rapid communication is critical to all of them. By integrating with IBM Z ChatOps, HCL Workload Automation for Z provides you with a chat tool where you are notified about incidents and can share information with other team members. You are alerted through the chat platform of your choice (such as Slack, Microsoft Teams, or Mattermost) and can communicate with the other chat users to share data and perform actions. Collaboration becomes easy, immediate, and effective, which promotes teamwork and helps address daily issues.
- Add the EQQINCID DD statement to the HCL Workload Automation for Z JCL procedure. For detailed information about EQQINCID, see Incident data set (EQQINCID).
- Define the INCOPTS statement.
- In the INCIDENT parameter of the ALERTS statement, set the alert conditions for which to post an incident through the chat tool.
- Add the certificates that you have downloaded from your incident notifying tool to a key ring that is trusted by HCL Workload Automation for Z, as follows:
- If you use a SAF ring, perform the following steps:
- Create the sequential data sets where the downloaded certificates are to be stored. In this procedure, the certificates INCTOOL.CERT.SERVER and INCTOOL.CERT.ROOT are used as an example. If your certificate is chained, you must create a data set for each intermediate or root certificate and save each certificate in its data set.
- Create a key ring (in this example, EQQRING) by using the certificate management command RACDCERT. Skip this step if you already use a key ring for HCL Workload Automation for Z services. For more information about the RACDCERT command, see the section RACDCERT (Manage RACF digital certificates) in the IBM z/OS Security Server RACF Command Language Reference manual.
RACDCERT ADDRING(EQQRING) ID(Your_RACF_userID)
- Add each certificate to the RACF database:
RACDCERT CERTAUTH ID(Your_RACF_userID) ADD('INCTOOL.CERT.ROOT') TRUST WITHLABEL('INCTOOL ROOT')
RACDCERT CERTAUTH ID(Your_RACF_userID) ADD('INCTOOL.CERT.SERVER') TRUST WITHLABEL('INCTOOL SERVER')
- Connect each certificate to the EQQRING key ring:
RACDCERT ID(Your_RACF_userID) CONNECT(LABEL('INCTOOL ROOT') RING(EQQRING) USAGE(CERTAUTH))
RACDCERT ID(Your_RACF_userID) CONNECT(LABEL('INCTOOL SERVER') RING(EQQRING) USAGE(PERSONAL))
- Check that the certificates have been successfully added to the chain:
RACDCERT ID(Your_RACF_userID) LISTCHAIN(LABEL('INCTOOL SERVER'))
- Update the SSL parameters in the HTTPOPTS statement according to the values that you have set in this procedure.
- If you use the keystore in the UNIX System Services, perform the following steps:
- Save the downloaded certificates into a USS directory. In this procedure, /u/mycerts is used as an example. If your certificate is chained, you must create a file for each intermediate or root certificate and save each certificate in its file.
- From /u/mycerts, create a keystore database (in this procedure, the gskkyman utility is used). Skip this step if you already use a database for HCL Workload Automation for Z services.
gskkyman
- From the Database Menu, select option 1 - Create new database. Skip this step if you already use a database for HCL Workload Automation for Z services. On completion, the following message is issued:
Key database /u/mycerts/my_db_name.kdb created
- To store your database password in a file, from the Key Management Menu select option 10 - Store database password. Skip this step if you already use a database for HCL Workload Automation for Z services. On completion, the following message is issued:
Database password stored in /u/mycerts/my_db_name.sth
- From the Database Menu, select option 2 - Open database and enter the key database name and database password.
- Import each certificate to your keystore database by selecting option 7 - Import a certificate from the Key Management Menu.
- Based on the values that you set in this procedure, update the SSL parameters in the HTTPOPTS statement.
- DURATION
- An operation in the current plan is active for an unexpectedly long time.
- ERROROPER
- An operation in the current plan is set to ended-in-error status.
- HIGHRISK
- The risk level of a critical operation in the current plan has become High.
- LATEOPER
- An operation in the current plan becomes late, which means that it reaches its latest start time and does not have the status started, complete, or deleted.
- OPCERROR
- An HCL Workload Automation for Z subtask or subsystem ends unexpectedly.
- POTENTRISK
- The risk level of a critical operation in the current plan has become Potential.
- SPECRES
- The time that an operation in the current plan is waiting to allocate a given resource exceeds the time specified by the RESOPTS CONTENTIONTIME parameter.
- WLMOPER
- An operation in the current plan is promoted by WLM.
- The members containing the text of the incidents, which you set in ALERTS INCIDENT.
- A member named RULES (required).
This member contains the rules that must be met for the incidents to be notified. Each rule consists of a FILTER, a HEADER, and optionally a TEXTMEMBER, in the following format:
FILTER(expression1, expression2, ..., expressionn) HEADER(header_text) [TEXTMEMBER(member_name)]
Note: Each rule is associated with only one FILTER, HEADER, and TEXTMEMBER. If within a single rule you specify more than one FILTER, HEADER, or TEXTMEMBER, only the first occurrence is taken into account.
Where:
FILTER(expression1, expression2, ..., expressionn)
- The expressions to be satisfied for the incident to be notified, separated by commas. The incident is notified when all the expressions in the filter are met; for each satisfied filter the corresponding incident is posted.
Each expression has the following format, which is not case-sensitive:
value=filter
where:
value
- String of alphanumeric characters, including variables (for details about variables, see Variables allowed in the EQQINCID members). It cannot contain blanks.
filter
- String of alphanumeric characters. It cannot contain blanks. You can use the wildcard characters asterisk (*) and percent sign (%).
For example, you can set a FILTER that includes all the applications whose names begin with MY and that ended with error code 16, as follows:
FILTER(&OADID=MY*,&OERRCODE=16)
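The way a filter is evaluated can be illustrated with a minimal Python sketch. It is not the product's implementation. It assumes that the asterisk (*) matches any run of characters and that the percent sign (%) matches exactly one character, which the text above does not state explicitly. The function names and the sample incident values are hypothetical.

```python
import re

def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a filter pattern into a compiled, case-insensitive regex.

    Assumption: '*' matches any run of characters (including none) and
    '%' matches exactly one character.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "%":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)

def filter_matches(filter_body: str, variables: dict) -> bool:
    """Return True only when every 'value=filter' expression is satisfied."""
    for expression in filter_body.split(","):
        value, _, pattern = expression.strip().partition("=")
        # A leading '&' marks a variable reference; anything else is a literal.
        name = value.lstrip("&").upper()
        actual = variables.get(name, "") if value.startswith("&") else value
        if not wildcard_to_regex(pattern).fullmatch(actual):
            return False
    return True

# Hypothetical incident data for an operation that ended in error.
incident = {"OADID": "MYPAYROLL", "OERRCODE": "16"}
print(filter_matches("&OADID=MY*,&OERRCODE=16", incident))
```

Because all the expressions must be met, changing the error code in the sample data to any value other than 16 makes the filter fail even though the application name still matches.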
HEADER(header_text)
- Information used for the incident header, separated by blanks. As the header_text you can specify the following information.
Note:
- Each piece of information (Summary, Priority, and Severity) is followed by a colon (:) and can be set only once. If you specify more than one, only the first is considered.
- The colon (:) cannot be specified inside the header_text. If you specify it, the text that follows is not considered.
Summary:
- Required.
Priority:
- Optional. Valid values are high, medium, low. The default is low.
Severity:
- Optional. Valid values are fatal, critical, major, minor, low. The default is low.
For example, you can set a HEADER as follows:
HEADER( Summary: This is the application error Priority: High Severity: Fatal )
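The header rules (Summary required, Priority and Severity optional with a default of low, first occurrence wins) can be sketched in Python as follows. This is an illustration under stated assumptions, not the product's parser: the function name is hypothetical, values are assumed to be case-insensitive as the example suggests, and the handling of a stray colon inside the text is not modeled.

```python
import re

# Valid values taken from the HEADER description.
VALID = {
    "priority": {"high", "medium", "low"},
    "severity": {"fatal", "critical", "major", "minor", "low"},
}

def parse_header(header_text: str) -> dict:
    """Split header text into Summary, Priority, and Severity."""
    fields = {}
    # Locate each 'Keyword:' marker; a value runs up to the next marker.
    markers = list(
        re.finditer(r"\b(summary|priority|severity):", header_text, re.IGNORECASE)
    )
    for i, marker in enumerate(markers):
        end = markers[i + 1].start() if i + 1 < len(markers) else len(header_text)
        key = marker.group(1).lower()
        value = header_text[marker.end():end].strip()
        fields.setdefault(key, value)  # only the first occurrence is kept
    if not fields.get("summary"):
        raise ValueError("Summary is required")
    for key in ("priority", "severity"):
        value = fields.get(key, "low").lower()  # default is low
        if value not in VALID[key]:
            raise ValueError(f"invalid {key}: {value}")
        fields[key] = value
    return fields

print(parse_header("Summary: This is the application error Priority: High Severity: Fatal"))
```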
TEXTMEMBER(member_name)
- Optional. The member containing the text of the incident. If you do not specify any, the member set in ALERTS INCIDENT is used as default. For each alert condition, one member is defined.
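To show how the three parts of a rule fit together, the following Python sketch splits one rule into its FILTER, HEADER, and TEXTMEMBER and keeps only the first occurrence of each. It is a simplified illustration, not the product's parser: it assumes that no part contains a closing parenthesis, and the member name MYTEXT is hypothetical.

```python
import re

# Matches KEYWORD( ... ) with no nested or embedded ')' (a simplifying assumption).
KEYWORD = re.compile(r"\b(FILTER|HEADER|TEXTMEMBER)\s*\(([^)]*)\)", re.IGNORECASE)

def parse_rule(rule_text: str) -> dict:
    """Return the FILTER, HEADER, and optional TEXTMEMBER of one rule."""
    rule = {}
    for match in KEYWORD.finditer(rule_text):
        # setdefault keeps the first occurrence and ignores later duplicates.
        rule.setdefault(match.group(1).upper(), match.group(2).strip())
    missing = [key for key in ("FILTER", "HEADER") if key not in rule]
    if missing:
        raise ValueError("rule is missing " + ", ".join(missing))
    return rule

rule = parse_rule(
    "FILTER(&OADID=MY*,&OERRCODE=16) "
    "HEADER( Summary: This is the application error Priority: High Severity: Fatal ) "
    "TEXTMEMBER(MYTEXT)"  # MYTEXT is a hypothetical member name
)
print(rule)
```

When TEXTMEMBER is omitted, the returned dictionary simply has no TEXTMEMBER key, which corresponds to falling back on the member set in ALERTS INCIDENT.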
Variable name (must be preceded by &) | Variable description | Max length | Alert condition |
---|---|---|---|
ALERCOND | Alert condition that generated the incident (for details, see the alert conditions listed in ALERTS INCIDENT). | 10 | DURATION, ERROROPER, HIGHRISK, LATEOPER, OPCERROR, POTENTRISK, SPECRES, WLMOPER |
OADID | Application ID | 16 | DURATION, ERROROPER, HIGHRISK, LATEOPER, POTENTRISK, SPECRES, WLMOPER |
OADOWNER | Occurrence owner. | 16 | DURATION, ERROROPER, LATEOPER, SPECRES, WLMOPER |
OTOKEN | Occurrence token. | 8 | DURATION, ERROROPER, LATEOPER, SPECRES, WLMOPER |
OAUGROUP | Authority group. | 8 | DURATION, ERROROPER, HIGHRISK, LATEOPER, POTENTRISK, SPECRES, WLMOPER |
ODMY1 | Occurrence input arrival date, DDMMYY. | 6 | DURATION, ERROROPER, HIGHRISK, LATEOPER, POTENTRISK, SPECRES, WLMOPER |
OJOBNAME | Operation job name. | 8 | DURATION, ERROROPER, HIGHRISK, LATEOPER, POTENTRISK, SPECRES, WLMOPER |
OOPNO | Operation number within the occurrence, right-justified and padded with zeros. | 3 | DURATION, ERROROPER, HIGHRISK, LATEOPER, POTENTRISK, SPECRES, WLMOPER |
OWSID | Workstation ID for the current operation. | 4 | DURATION, ERROROPER, HIGHRISK, LATEOPER, POTENTRISK, SPECRES, WLMOPER |
OJOBID | Job number. | 8 | DURATION, ERROROPER, WLMOPER |
OERRCODE | Error code. | 4 | ERROROPER |
RESNAME | Resource name. | 44 | SPECRES |
RESWTTM | Resource waiting time | 4 | SPECRES |
TASKNAME | HCL Workload Automation for Z task name. | 16 | OPCERROR |
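As an illustration of how the variables in the table might be resolved inside the text of an incident, the following Python sketch replaces each &NAME reference with a value. It is not the product's behavior: the function name, the sample text, and the sample values are hypothetical, and treating Max length as a truncation limit is an assumption.

```python
import re

# Max lengths copied from the table for a few of the variables.
MAX_LENGTH = {"ALERCOND": 10, "OADID": 16, "OJOBNAME": 8, "OERRCODE": 4}

def substitute(text: str, values: dict) -> str:
    """Replace each &NAME reference with its value; names are not case-sensitive."""
    def replace(match: re.Match) -> str:
        name = match.group(1).upper()
        if name not in values:
            return match.group(0)  # leave unknown variables untouched
        # Assumption: values longer than the documented max length are cut.
        return str(values[name])[: MAX_LENGTH.get(name)]
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*)", replace, text)

message = "Job &OJOBNAME in application &OADID ended with error &OERRCODE"
values = {"OJOBNAME": "PAYJOB01", "OADID": "MYPAYROLL", "OERRCODE": "16"}
print(substitute(message, values))
```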
Troubleshooting
When errors occur in detecting and notifying an incident, messages are logged in the EQQMLOG file. You can set a further level of diagnosis by adding DIAGNOSE MONFLAGS(X'00000200') to the member of the EQQPARM library.