You can install a full deployment of IBM Workload Automation on the Red Hat OpenShift Container Platform.
An IBM Certified Container meets standard criteria for packaging and deploying containerized software with platform integrations. It accelerates time to value and improves enterprise readiness at a lower cost than containers alone.
The information in this README contains the steps for deploying and running the IBM Workload Automation Operator that is then used to deploy the following IBM Workload Automation components:
IBM Workload Automation, which comprises master domain manager and its backup, Dynamic Workload Console, and Dynamic Agent
For more information about IBM Workload Automation, see the product documentation library in IBM Documentation.
By default, a single server (master domain manager), Dynamic Workload Console (console), and dynamic agent are installed.
To achieve high availability in an IBM Workload Automation environment, the minimum base configuration is composed of 2 Dynamic Workload Consoles and 2 servers (master domain managers). For more details about IBM Workload Automation and high availability, see An active-active high availability scenario.
Workload Automation can be deployed across a single cluster, but you can add multiple instances of the WA Operator and product components by using a different project for each of the Operators you deploy in the cluster. The WA Operators and product components can run in multiple failure zones in a single cluster.
In addition to the product components, the following objects are installed:
Object type | Operator | Agent | Console | Server (MDM)
---|---|---|---|---
Deployments | 1 - ibm-workload-automation-operator | | |
Pods | ibm-workload-automation-operator-xxxxxxxxxx, where xxxxxxxxxx is a random name | wa-waagent-0 | wa-waconsole-0 | wa-waserver-0
Stateful Sets | | wa-waagent for dynamic agent | wa-waconsole | wa-waserver
Secrets | sa-workload-automation (pull Operator images from entitled registry), sa-{{ .Release.Namespace }} (pull images from entitled registry) | | wa-pwd-secret | wa-pwd-secret
Certificates (Secret) | | wa-waagent | wa-waserver | wa-waserver
Network Policy | | da-network-policy | dwc-network-policy | mdm-network-policy, allow-mdm-to-mdm-network-policy
Services | | wa-waagent-h | wa-waconsole, wa-waconsole-h | wa-waserver, wa-waserver-h
Routes | | | wa-console-route | wa-server-route
PVC (generated from Helm chart). The default deployment includes a single (replicaCount=1) server, console, and agent. Create a PVC for each instance of each component. | | 1 PVC: data-wa-waagent-waagent0 | 1 PVC: data-wa-waconsole-waconsole0 | 1 PVC: data-wa-waserver-waserver0
PV (generated by PVC) | | 1 PV | 1 PV | 1 PV
Service Accounts | workload-automation-operator, wauser (default) | | |
Roles | | wa-pod-role | wa-pod-role | wa-pod-role
Role Bindings | | wa-pod-role-binding | wa-pod-role-binding | wa-pod-role-binding
Cluster Roles | workload-automation-operator service accounts | {{ .Release.Namespace }}-wa-pod-cluster-role-get-routes (name of the ClusterRole, where {{ .Release.Namespace }} represents the name of the project <workload-automation-project>) | {{ .Release.Namespace }}-wa-pod-cluster-role-get-routes | {{ .Release.Namespace }}-wa-pod-cluster-role-get-routes
Cluster Role Bindings | workload-automation-operator service accounts | {{ .Release.Namespace }}-wa-pod-cluster-role-get-routes-binding (name of the ClusterRoleBinding, where {{ .Release.Namespace }} represents the name of the project <workload-automation-project>) | {{ .Release.Namespace }}-wa-pod-cluster-role-get-routes-binding | {{ .Release.Namespace }}-wa-pod-cluster-role-get-routes-binding
Custom resources (CR) | WorkloadAutomation (a shared, unique resource across the OCP cluster that remains the same even for additional Operators in different projects) | | |
You can access the IBM Workload Automation certified container images from the IBM Entitled Registry. For information about obtaining your entitlement key from the IBM Entitled Registry, see the instructions in the README Deploying the IBM Workload Automation Operator.
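For reference, a pull secret for the entitled registry is typically created with a command similar to the following sketch. The exact command and secret name are documented in the README Deploying the IBM Workload Automation Operator; the project name and your entitlement key are placeholders, and the user name cp is the standard user for the IBM Entitled Registry:

```bash
# Create a pull secret for the IBM Entitled Registry (cp.icr.io) in the target project
oc create secret docker-registry sa-workload-automation \
  --docker-server=cp.icr.io \
  --docker-username=cp \
  --docker-password=<entitlement_key> \
  -n <workload-automation-project>
```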
Before you begin the deployment process, ensure your environment meets the following prerequisites:
Access to the IBM Entitled Registry: cp.icr.io
A supported storage provider for persistent volumes, for example:
Provider | Disk Type | PVC Size | IOPS per GB | PVC Access Mode |
---|---|---|---|---|
GCP Persistent Disk | Standard HDD | Default | Default | ReadWriteOnce |
GCP Persistent Disk | Fast SSD | Default | Default | ReadWriteOnce |
Ceph | - | Default | Default | ReadWriteOnce |
AWS EBS | GP2 SSD | Default | Default | ReadWriteOnce |
AWS EBS | IO1 SSD | 100 GB | 50 | ReadWriteOnce |
AWS EBS | ST1 HDD | 500 GB | Default | ReadWriteOnce |
The reclaim policy can be either Retain or Delete.
For more details about the storage requirements for your persistent volume claims, see the Storage section of this README file.
This chart requires a Pod Security Policy to be bound to the target namespace prior to the installation. To meet this requirement, there might be cluster-scoped as well as namespace-scoped pre- and post-installation actions that need to occur.
The predefined PodSecurityPolicy named ibm-restricted-psp has been verified for this chart when deploying agents, consoles, and a single instance of the server (replicaCount=1). If your target namespace is bound to this PodSecurityPolicy, you can proceed to install the chart.
From the user interface, you can copy and paste the following snippets to enable the custom PodSecurityPolicy:
Custom PodSecurityPolicy definition:
```yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  annotations:
    kubernetes.io/description: "This policy is the most restrictive, requiring pods to run with a non-root UID, and preventing pods from accessing the host."
    #apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
    #apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default
    seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default
  name: ibm-restricted-psp
spec:
  allowPrivilegeEscalation: false
  forbiddenSysctls:
  - '*'
  fsGroup:
    ranges:
    - max: 65535
      min: 1
    rule: MustRunAs
  requiredDropCapabilities:
  - ALL
  runAsUser:
    rule: MustRunAsNonRoot
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    ranges:
    - max: 65535
      min: 1
    rule: MustRunAs
  volumes:
  - configMap
  - emptyDir
  - projected
  - secret
  - downwardAPI
  - persistentVolumeClaim
```
Custom ClusterRole for the custom PodSecurityPolicy:
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: ibm-restricted-psp-clusterrole
rules:
- apiGroups:
  - extensions
  resourceNames:
  - ibm-restricted-psp
  resources:
  - podsecuritypolicies
  verbs:
  - use
```
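To make the custom PodSecurityPolicy usable in the target namespace, the ClusterRole must then be bound to the service accounts in that namespace. The following is a minimal sketch of such a binding; the binding name and project name are placeholders:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ibm-restricted-psp-rolebinding      # placeholder name
  namespace: <workload-automation-project>  # target namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: ibm-restricted-psp-clusterrole
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:serviceaccounts:<workload-automation-project>
```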
Before installing the Operator, ensure that you apply the SecurityContextConstraints (SCC) for Workload Automation: the ibm-workload-automation-restricted-scc policy. To grant the ibm-workload-automation-restricted-scc policy, a cluster administrator can run the SCC-related command as described in the README Deploying the IBM Workload Automation Operator.
The following is the SCC definition:
```yaml
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  annotations:
    kubernetes.io/description: "This policy is the most restrictive for IBM Workload Automation,
      requiring pods to run with a non-root UID, and preventing pods from accessing the host.
      The UID and GID will be bound by ranges specified at the Namespace level."
    cloudpak.ibm.com/version: "1.1.0"
  name: ibm-workload-automation-restricted-scc
  labels:
    app.kubernetes.io/name: ibm-workload-automation-operator
    app.kubernetes.io/instance: ibm-workload-automation-operator-instance
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/managed-by: ibm-workload-automation-operator
allowHostDirVolumePlugin: false
allowHostIPC: false
allowHostNetwork: false
allowHostPID: false
allowHostPorts: false
allowPrivilegedContainer: false
allowPrivilegeEscalation: false
allowedCapabilities: null
allowedFlexVolumes: null
allowedUnsafeSysctls: null
defaultAddCapabilities: null
defaultAllowPrivilegeEscalation: false
forbiddenSysctls:
- "*"
fsGroup:
  type: MustRunAs
  ranges:
  - max: 65535
    min: 1
readOnlyRootFilesystem: false
requiredDropCapabilities:
- ALL
runAsUser:
  type: MustRunAsNonRoot
seccompProfiles:
- docker/default
# This can be customized for seLinuxOptions specific to your host machine
seLinuxContext:
  type: RunAsAny
#  seLinuxOptions:
#    level:
#    user:
#    role:
#    type:
supplementalGroups:
  type: MustRunAs
  ranges:
  - max: 65535
    min: 1
# This can be customized to host specifics
volumes:
- configMap
- downwardAPI
- emptyDir
- persistentVolumeClaim
- projected
- secret
```
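For example, a cluster administrator might save the definition above to a file, apply it, and grant the SCC to the Workload Automation service account with commands similar to the following sketch. The file name and project name are placeholders; the authoritative command is documented in the README Deploying the IBM Workload Automation Operator:

```bash
# Apply the SecurityContextConstraints definition saved to a local file
oc apply -f ibm-workload-automation-restricted-scc.yaml

# Grant the SCC to the Workload Automation service account (wauser) in the target project
oc adm policy add-scc-to-user ibm-workload-automation-restricted-scc \
  -z wauser -n <workload-automation-project>
```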
The following resources correspond to the default values required to manage a production environment. These numbers might vary depending on the environment.
Component | Container resource limit | Container resource request |
---|---|---|
WA Operator | CPU: 500m, Memory: 750Mi | CPU: 250m, Memory: 500Mi |
Server | CPU: 4, Memory: 16Gi | CPU: 1, Memory: 6Gi, Storage: 10Gi |
Console | CPU: 4, Memory: 16Gi | CPU: 1, Memory: 4Gi, Storage: 5Gi |
Dynamic Agent | CPU: 1, Memory: 2Gi | CPU: 200m, Memory: 200Mi, Storage size: 2Gi |
The following table outlines the personas and related OCP cluster role required to perform each task:
Use Case | WA Personas | OCP Cluster Role |
---|---|---|
Install CASE (Operator) | Admin | cluster administrator |
Install IBM common services | Admin | cluster administrator |
IBM CertManager | Admin | cluster administrator |
IBM IAM management | Admin | cluster administrator |
IBM Grafana/Prometheus | Admin | cluster administrator |
Configure the CASE (Install WA components) | Operator | workload-automation-project administrator |
Verify the installation | Deployer | workload-automation-project administrator |
Install Automation Hub plug-ins | Admin/Operator | workload-automation-project administrator |
Scaling the product | Operator | workload-automation-project administrator |
Upgrade the CASE (Operator) | Admin | cluster administrator |
Upgrade WA components | Operator | workload-automation-project administrator |
Rollback the CASE (Operator) | Admin | cluster administrator |
Rollback WA components | Operator | workload-automation-project administrator |
Add custom Grafana dashboards | Operator | workload-automation-project administrator |
Create scheduling objects | Developer | workload-automation-project administrator |
For more information about which IBM Workload Automation tasks users can perform depending on the group to which they are assigned, see Users and groups.
Installing and configuring the CASE (Operator) involves the following high-level steps:
The procedures related to these steps can be found in the following README files:
The following tables list the configurable parameters of the chart, an example of the values and the default values. The tables are organized as follows:
Global parameters (all product components)
Agent parameters
Dynamic Workload Console parameters
Server parameters (master domain manager)
The following table lists the global configurable parameters of the chart relative to all product components and an example of their values:
Parameter | Description | Mandatory | Example | Default |
---|---|---|---|---|
global.license | Use ACCEPT to agree to the license agreement | yes | not accepted | not accepted |
global.enableServer | If enabled, the Server application is deployed | no | true | true |
global.enableConsole | If enabled, the Console application is deployed | no | true | true |
global.enableAgent | If enabled, the Agent application is deployed | no | true | true |
global.serviceAccountName | The name of the serviceAccount to use. The default is the Workload Automation service account (wauser), not the default cluster service account | no | default | wauser |
global.language | The language of the container internal system. The supported languages are: en (English), de (German), es (Spanish), fr (French), it (Italian), ja (Japanese), ko (Korean), pt_BR (Portuguese (BR)), ru (Russian), zh_CN (Simplified Chinese) and zh_TW (Traditional Chinese) | yes | en | en |
global.customLabels | This parameter contains two fields: name and value. Insert customizable labels to group resources linked together. | no | name: environment value: prod | name: environment value: prod |
global.enablePrometheus | Use to enable (true) or disable (false) Prometheus metrics | no | true | true |
global.customPlugins | If specified, the plug-ins and integrations listed in the configMap file are automatically installed when deploying the server and console containers. See Installing Automation Hub integrations in the Case for details about the procedure. | no | mycustomplugin (the value specified must match the value specified in the configMap file) | |
global.customPluginsImageName | To install a custom plug-in when deploying the server and console containers, specify the name of the Docker registry, the plug-in image, and the tag assigned to the Docker image. See Installing custom plug-ins in the Case for details about the procedure. | no | myregistry/mypluginimage:my_tag |
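As an illustration only, the global section of the WorkloadAutomation custom resource might look like the following sketch, which simply nests the dotted parameter names listed above (values are examples, not recommendations):

```yaml
global:
  license: accept          # set to accept to agree to the license agreement (default: not accepted)
  enableServer: true
  enableConsole: true
  enableAgent: true
  serviceAccountName: wauser
  language: en
  enablePrometheus: true
```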
The following table lists the configurable parameters of the chart relative to the agent and an example of their values:
Parameter | Description | Mandatory | Example | Default |
---|---|---|---|---|
waagent.fsGroupId | The secondary group ID of the user | no | 999 | |
waagent.supplementalGroupId | Supplemental group id of the user | no | ||
waagent.replicaCount | Number of replicas to deploy | yes | 1 | 1 |
waagent.image.repository | @PRODUCT.NAME@ Agent image repository | yes | @DOCKER.AGENT.IMAGE.NAME@ | @DOCKER.AGENT.IMAGE.NAME@ |
waagent.image.tag | @PRODUCT.NAME@ Agent image tag | yes | @VERSION@ | @VERSION@ |
waagent.image.pullPolicy | image pull policy | yes | Always | Always |
waagent.licenseType | Product license management (IBM Workload Scheduler only) | yes | PVU | PVU |
waagent.agent.name | Agent display name | yes | WA_AGT | WA_AGT |
waagent.agent.tz | If used, it sets the TZ operating system environment variable | no | America/Chicago | |
waagent.agent.networkpolicyEgress | Customize egress policy. Controls network traffic and how a component pod is allowed to communicate with other pods. If empty, no egress policy is defined | no | See Network enablement | |
waagent.agent.nodeAffinityRequired | A set of rules that determines on which nodes an agent can be deployed using custom labels on nodes and label selectors specified in pods. | no | See Network enablement | |
waagent.agent.dynamic.server.mdmhostname | Hostname or IP address of the master domain manager | no (mandatory if a server is not present inside the same project) | wamdm.demo.com | |
waagent.agent.dynamic.server.port | The HTTPS port that the dynamic agent must use to connect to the master domain manager | no | 31116 | 31116 |
waagent.agent.dynamic.pools* | The static pools of which the Agent should be a member | no | Pool1, Pool2 | |
waagent.agent.dynamic.useCustomizedCert | If true, customized SSL certificates are used to connect to the master domain manager | no | false | false |
waagent.agent.dynamic.certSecretName | The name of the secret to store customized SSL certificates | no | waagent-cert-secret | |
waagent.agent.containerDebug | The container is executed in debug mode | no | no | no |
waagent.agent.livenessProbe.initialDelaySeconds | The number of seconds after which the liveness probe starts checking if the server is running | yes | 60 | 60 |
waagent.resources.requests.cpu | The minimum CPU requested to run | yes | 200m | 200m |
waagent.resources.requests.memory | The minimum memory requested to run | yes | 200Mi | 200Mi |
waagent.resources.limits.cpu | The maximum CPU requested to run | yes | 1 | 1 |
waagent.resources.limits.memory | The maximum memory requested to run | yes | 2Gi | 2Gi |
waagent.persistence.enabled | If true, persistent volumes for the pods are used | no | true | true |
waagent.persistence.useDynamicProvisioning | If true, StorageClasses are used to dynamically create persistent volumes for the pods | no | true | true |
waagent.persistence.dataPVC.name | The prefix for the Persistent Volumes Claim name | no | data | data |
waagent.persistence.dataPVC.storageClassName | The name of the Storage Class to be used. Leave empty to not use a storage class | no | nfs-dynamic | |
waagent.persistence.dataPVC.selector.label | Volume label to bind (only limited to single label) | no | my-volume-label | |
waagent.persistence.dataPVC.selector.value | Volume label value to bind (only limited to single value) | no | my-volume-value | |
waagent.persistence.dataPVC.size | The minimum size of the Persistent Volume | no | 2Gi | 2Gi |
(*) Note: for details about static agent workstation pools, see Workstation.
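For instance, an agent that registers with an external master domain manager and joins two static pools might be described with a fragment like the following sketch, based on the parameters above (the host name and pool names are placeholders):

```yaml
waagent:
  replicaCount: 1
  agent:
    name: WA_AGT
    dynamic:
      server:
        mdmhostname: wamdm.demo.com   # required only when no server is deployed in the same project
        port: 31116
      pools: Pool1, Pool2             # static pools the agent joins
  persistence:
    enabled: true
    dataPVC:
      size: 2Gi
```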
The following table lists the configurable parameters of the chart relative to the console and an example of their values:
Parameter | Description | Mandatory | Example | Default |
---|---|---|---|---|
waconsole.fsGroupId | The secondary group ID of the user | no | 999 | |
waconsole.supplementalGroupId | Supplemental group id of the user | no | ||
waconsole.replicaCount | Number of replicas to deploy | yes | 1 | 1 |
waconsole.image.repository | @PRODUCT.NAME@ Console image repository | yes | @DOCKER.CONSOLE.IMAGE.NAME@ | @DOCKER.CONSOLE.IMAGE.NAME@ |
waconsole.image.tag | @PRODUCT.NAME@ Console image tag | yes | @VERSION@ | @VERSION@ |
waconsole.image.pullPolicy | Image pull policy | yes | Always | Always |
waconsole.console.containerDebug | The container is executed in debug mode | no | no | no |
waconsole.console.db.type | The preferred remote database server type (e.g. DERBY, DB2, ORACLE, MSSQL, IDS). Use Derby database only for demo or test purposes. | yes | DB2 | DB2 |
waconsole.console.db.hostname | The Hostname or the IP Address of the database server | yes | <dbhostname> | |
waconsole.console.db.port | The port of the database server | yes | 50000 | 50000 |
waconsole.console.db.name | Depending on the database type, the name is different; enter the name of the Server’s database for DB2/Informix/MSSQL, enter the Oracle Service Name for Oracle | yes | TWS | TWS |
waconsole.console.db.tsName | The name of the DATA table space | no | TWS_DATA | |
waconsole.console.db.tsPath | The path of the DATA table space | no | TWS_DATA | |
waconsole.console.db.tsTempName | The name of the TEMP table space (Valid only for Oracle) | no | TEMP | leave it blank |
waconsole.console.db.tssbspace | The name of the SB table space (Valid only for IDS). | no | twssbspace | twssbspace |
waconsole.console.db.user | The database user who accesses the Console tables on the database server. In case of Oracle, it identifies also the database. It can be specified in a secret too | yes | db2inst1 | |
waconsole.console.db.adminUser | The database user administrator who accesses the Console tables on the database server. It can be specified in a secret too | yes | db2inst1 | |
waconsole.console.db.sslConnection | If true, SSL is used to connect to the database (Valid only for DB2) | no | false | false |
waconsole.console.db.usepartitioning | Enable the Oracle Partitioning feature. Valid only for Oracle. Ignored for other databases | no | true | true |
waconsole.console.pwdSecretName | The name of the secret to store all passwords | yes | wa-pwd-secret | wa-pwd-secret |
waconsole.console.livenessProbe.initialDelaySeconds | The number of seconds after which the liveness probe starts checking if the server is running | yes | 100 | 100 |
waconsole.console.useCustomizedCert | If true, customized SSL certificates are used to connect to the Dynamic Workload Console | no | false | false |
waconsole.console.certSecretName | The name of the secret to store customized SSL certificates | no | waconsole-cert-secret | |
waconsole.console.libConfigName | The name of the ConfigMap to store all custom liberty configuration | no | libertyConfigMap | |
waconsole.console.routes.enabled | If true, the ingress controller rules are enabled | no | true | true |
waconsole.console.enableSSO | If true, single sign-on for the Dynamic Workload Console is enabled with LDAP. The console is configured in SSO using JSON Web Token (JWT). If false, LTPA keys are used for enabling SSO. | no | true | true |
waconsole.console.adminGroup | The group defined in the YAML for LDAP. The group to which the console grants the admin role. For IAM, the team defined in the IAM that coincides with the group defined on the IAM. | no (mandatory when waconsole.console.enableSSO is set to true) | icp:wa-admins:admin | icp:wa-admins:admin |
waconsole.resources.requests.cpu | The minimum CPU requested to run | yes | 1 | 1 |
waconsole.resources.requests.memory | The minimum memory requested to run | yes | 4Gi | 4Gi |
waconsole.resources.limits.cpu | The maximum CPU requested to run | yes | 4 | 4 |
waconsole.resources.limits.memory | The maximum memory requested to run | yes | 16Gi | 16Gi |
waconsole.persistence.enabled | If true, persistent volumes for the pods are used | no | true | true |
waconsole.persistence.useDynamicProvisioning | If true, StorageClasses are used to dynamically create persistent volumes for the pods | no | true | true |
waconsole.persistence.dataPVC.name | The prefix for the Persistent Volumes Claim name | no | data | data |
waconsole.persistence.dataPVC.storageClassName | The name of the StorageClass to be used. Leave empty to not use a storage class | no | nfs-dynamic | |
waconsole.persistence.dataPVC.selector.label | Volume label to bind (only limited to single label) | no | my-volume-label | |
waconsole.persistence.dataPVC.selector.value | Volume label value to bind (only limited to single label) | no | my-volume-value | |
waconsole.persistence.dataPVC.size | The minimum size of the Persistent Volume | no | 5Gi | 5Gi |
waconsole.console.networkpolicyEgress | Customize egress policy. Controls network traffic and how a component pod is allowed to communicate with other pods. If empty, no egress policy is defined | no | See Network enablement | |
waconsole.console.nodeAffinityRequired | A set of rules that determines on which nodes a console can be deployed using custom labels on nodes and label selectors specified in pods. | no | See Network enablement |
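To illustrate, a console fragment of the custom resource might resemble the following sketch (the database host is a placeholder and the values are taken from the examples in the table above):

```yaml
waconsole:
  replicaCount: 1
  console:
    db:
      type: DB2
      hostname: <dbhostname>
      port: 50000
      name: TWS
      user: db2inst1
      adminUser: db2inst1
    pwdSecretName: wa-pwd-secret
    routes:
      enabled: true
    enableSSO: true
    adminGroup: icp:wa-admins:admin
```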
The following table lists the configurable parameters of the chart relative to the server (master domain manager) and an example of their values:
Parameter | Description | Mandatory | Example | Default |
---|---|---|---|---|
waserver.replicaCount | Number of replicas to deploy | yes | 1 | 1 |
waserver.image.repository | IBM Server image repository | yes | <repository_url> | <repository_url> defined when creating the Operator |
waserver.image.tag | IBM Server image tag | yes | 1.0.0 | the tag specified when creating the Operator |
waserver.image.pullPolicy | Image pull policy | yes | Always | Always |
waserver.licenseType | Product license management (IBM Workload Scheduler only) | yes | PVU | PVU |
waserver.fsGroupId | The secondary group ID of the user | no | 999 | |
waserver.server.company | The name of your Company | no | my-company | my-company |
waserver.server.agentName | The name to be assigned to the dynamic agent of the Server | no | WA_SAGT | WA_AGT |
waserver.server.dateFormat | The date format defined in the plan | no | MM/DD/YYYY | MM/DD/YYYY |
waserver.server.timezone | The timezone used in the create plan command | no | America/Chicago | |
waserver.server.startOfDay | The start time of the plan processing day in 24 hour format: hhmm | no | 0000 | 0700 |
waserver.server.tz | If used, it sets the TZ operating system environment variable | no | America/Chicago | |
waserver.server.networkpolicyEgress | Controls network traffic and how a component pod is allowed to communicate with other pods. Customize egress policy. If empty, no egress policy is defined | no | See Network enablement | |
waserver.server.nodeAffinityRequired | A set of rules that determines on which nodes a server can be deployed using custom labels on nodes and label selectors specified in pods. | no | See Network enablement | |
waserver.server.createPlan | If true, an automatic JnextPlan is executed at the same time as the container deployment | no | no | no |
waserver.server.containerDebug | The container is executed in debug mode | no | no | no |
waserver.server.db.type | The preferred remote database server type (e.g. DERBY, DB2, ORACLE, MSSQL, IDS) | yes | DB2 | DB2 |
waserver.server.db.hostname | The Hostname or the IP Address of the database server | yes | <dbhostname> | |
waserver.server.db.port | The port of the database server | yes | 50000 | 50000 |
waserver.server.db.name | Depending on the database type, the name is different; enter the name of the Server’s database for DB2/Informix/MSSQL, enter the Oracle Service Name for Oracle | yes | TWS | TWS |
waserver.server.db.tsName | The name of the DATA table space | no | TWS_DATA | |
waserver.server.db.tsPath | The path of the DATA table space | no | TWS_DATA | |
waserver.server.db.tsLogName | The name of the LOG table space | no | TWS_LOG | |
waserver.server.db.tsLogPath | The path of the LOG table space | no | TWS_LOG | |
waserver.server.db.tsPlanName | The name of the PLAN table space | no | TWS_PLAN | |
waserver.server.db.tsPlanPath | The path of the PLAN table space | no | TWS_PLAN | |
waserver.server.db.tsTempName | The name of the TEMP table space (Valid only for Oracle) | no | TEMP | leave it empty |
waserver.server.db.tssbspace | The name of the SB table space (Valid only for IDS) | no | twssbspace | twssbspace |
waserver.server.db.usepartitioning | If true, the Oracle Partitioning feature is enabled. Valid only for Oracle, it is ignored by other databases. The default value is true | no | true | true |
waserver.server.db.user | The database user who accesses the Server tables on the database server. In case of Oracle, it identifies also the database. It can be specified in a secret too | yes | db2inst1 | |
waserver.server.db.adminUser | The database user administrator who accesses the Server tables on the database server. It can be specified in a secret too | yes | db2inst1 | |
waserver.server.db.sslConnection | If true, SSL is used to connect to the database (Valid only for DB2) | no | false | false |
waserver.server.pwdSecretName | The name of the secret to store all passwords | yes | wa-pwd-secret | wa-pwd-secret |
waserver.livenessProbe.initialDelaySeconds | The number of seconds after which the liveness probe starts checking if the server is running | yes | 600 | 850 |
waserver.readinessProbe.initialDelaySeconds | The number of seconds before the probe starts checking the readiness of the server | yes | 600 | 530 |
waserver.server.useCustomizedCert | If true, customized SSL certificates are used to connect to the master domain manager | no | false | false |
waserver.server.certSecretName | The name of the secret to store customized SSL certificates | no | waserver-cert-secret | |
waserver.server.libConfigName | The name of the ConfigMap to store all custom liberty configuration | no | libertyConfigMap | |
waserver.server.enableSSO | If true, single sign-on for the server is enabled with LDAP. The server is configured in SSO using JSON Web Token (JWT). If false, LTPA keys are used for enabling SSO. | no | true | true |
waserver.server.adminGroup | The group defined in the YAML for LDAP. The group to which the server grants the admin role. For IAM, the team defined in the IAM that coincides with the group defined on the IAM. | no (mandatory when waserver.server.enableSSO is set to true) | icp:wa-admins:admin | icp:wa-admins:admin |
waserver.server.routes.enabled | If true, the routes controller rules are enabled | no | true | true |
waserver.server.routes.hostname | The virtual hostname defined in the DNS used to reach the Server | no | server.mycluster.proxy | |
waserver.resources.requests.cpu | The minimum CPU requested to run | yes | 1 | 1 |
waserver.resources.requests.memory | The minimum memory requested to run | yes | 4Gi | 4Gi |
waserver.resources.limits.cpu | The maximum CPU requested to run | yes | 4 | 4 |
waserver.resources.limits.memory | The maximum memory requested to run | yes | 16Gi | 16Gi |
waserver.persistence.enabled | If true, persistent volumes for the pods are used | no | true | true |
waserver.persistence.useDynamicProvisioning | If true, StorageClasses are used to dynamically create persistent volumes for the pods | no | true | true |
waserver.persistence.dataPVC.name | The prefix for the Persistent Volumes Claim name | no | data | data |
waserver.persistence.dataPVC.storageClassName | The name of the StorageClass to be used. Leave empty to not use a storage class | no | nfs-dynamic | |
waserver.persistence.dataPVC.selector.label | Volume label to bind (only limited to single label) | no | my-volume-label | |
waserver.persistence.dataPVC.selector.value | Volume label value to bind (only limited to single value) | no | my-volume-value | |
waserver.persistence.dataPVC.size | The minimum size of the Persistent Volume | no | 5Gi | 5Gi |
waserver.server.ftaName | The name of the IBM Workload Automation workstation for this installation. | no | WA-SERVER |
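Similarly, a minimal server fragment might look like the following sketch (values are the examples from the table above; the database host is a placeholder):

```yaml
waserver:
  replicaCount: 1
  server:
    company: my-company
    startOfDay: "0700"
    timezone: America/Chicago
    db:
      type: DB2
      hostname: <dbhostname>
      port: 50000
      name: TWS
      user: db2inst1
      adminUser: db2inst1
    pwdSecretName: wa-pwd-secret
  persistence:
    dataPVC:
      size: 5Gi
```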
You can specify an egress network policy to include a list of allowed egress rules for the server, console, and agent components. Each rule allows traffic leaving the cluster which matches both the “to” and “ports” sections. For example, the following sample demonstrates how to allow egress to another destination:
```yaml
networkpolicyEgress:
  - name: to-mdm
    egress:
    - to:
      - podSelector:
          matchLabels:
            app.kubernetes.io/name: waserver
      ports:
      - port: 31116
        protocol: TCP
  - name: dns
    egress:
    - to:
      - namespaceSelector:
          matchLabels:
            name: kube-system
      ports:
      - port: 53
        protocol: UDP
      - port: 53
        protocol: TCP
```
For more information, see Network Policies.
You can also specify a required node affinity rule to determine on which nodes a component can be deployed, using custom labels on nodes and label selectors specified in pods. The following is an example:
```yaml
nodeAffinityRequired:
  - key: iwa-node
    operator: In
    values:
    - 'true'
```
where iwa-node represents the custom node label evaluated by the required node affinity rule.
By default, a single server, console, and agent are installed. If you want to change the topology for IBM Workload Automation, modify the value of the replicaCount parameter in the YAML file for each component and save the changes. The IBM Workload Automation Operator instance automatically updates the number of instances for each of the components accordingly. See Required Roles for information about the role required to run the scaling procedures.
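For example, to reach the minimum high availability configuration described earlier (2 servers and 2 Dynamic Workload Consoles), you might set the replicaCount values in the custom resource as in this sketch:

```yaml
waserver:
  replicaCount: 2     # 1 master domain manager + 1 backup master
waconsole:
  replicaCount: 2     # 2 Dynamic Workload Console instances
waagent:
  replicaCount: 1
```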
Perform the following steps to scale up one or more Workload Automation components:
As the OCP admin user, you can scale up the Workload Automation components by performing the following steps from the OCP console:
Modify the value of the replicaCount parameter for the component you want to scale up and click Save.
Note: When you scale up a server, the additional server instances are installed with the Backup Master role, and the workstation definitions are automatically saved to the Workload Automation relational database. To complete the scaling up of a server component, run
JnextPlan -for 0000
from the server that has the role of master domain manager to add new backup master workstations to the current plan. The agent workstations installed on the new server instances are automatically saved in the database and linked to the Broker workstation with no further manual actions.
Perform the following steps to scale down one or more Workload Automation components:
Modify the value of the replicaCount parameter for the component you want to scale down and click Save.
Note:
- When you scale down each type of component, the persistent volume (PV) that the storage class created for the pod instance is not deleted, to avoid losing data in case the scale-down was not intended. When you subsequently scale up again, the new component instances are installed by using the old PVs.
- When you scale down a server or agent component, the workstation definitions are not removed from the database; you can manually delete them or set them to ignore to avoid having a non-working workstation in the plan. If you need an immediate change to the plan, run
JnextPlan -for 0000
from the master workstation.
The Workload Automation Helm Operator does not support automatic scaling to zero. If you want to manually scale the Dynamic Workload Console component to zero, set the value of the replicaCount
parameter to zero. To maintain the current Workload Automation scheduling and topology, do not set the replicaCount
value for the server and agent components to zero.
The Workload Automation Helm Operator does not support proportional scaling.
By default, SSO configuration is enabled using JSON Web Token (JWT). If you maintain the default SSO setting (enableSSO=true), both the master domain manager and the Dynamic Workload Console are configured in SSO, using JWT. For more information, see Defining Single Sign-On options.
To enable SSO using LTPA keys (enableSSO=false), follow this procedure.
To enable SSO between console and server, LTPA tokens must be the same. The following procedure explains how to create LTPA tokens to be shared between server and console (this procedure must be run only once and not on both systems).
To access the container:
From the command line, log into the OpenShift Enterprise cluster…
Launch the following command:
oc exec -it <server_pod_name> -- /bin/bash
Create a new LTPA token by launching the following command:
/opt/wautils/wa_create_ltpa_keys.sh -p <keys_password> -o /home/wauser
where:
<keys_password> is the LTPA keys password.
For more information, see Configuring the Dynamic Workload Console.
The “ltpa.keys” and “wa_ltpa.xml” files are created in /home/wauser.
Exit from the container by launching the “exit” command.
Copy the created files to the local machine, by launching the following command:
oc cp <server_pod_name>:/home/wauser/ltpa.keys <host_dir>
oc cp <server_pod_name>:/home/wauser/wa_ltpa.xml <host_dir>
where:
<host_dir> is an existing folder on the local machine where oc runs.
The “ltpa.keys” file must be placed into the secret that stores customized SSL certificates (on both server and console charts); to place it into the secret, launch the following command:
oc create secret generic <secret_name> --from-file=<host_dir>/ltpa.keys --namespace=<workload-automation-project>
The “wa_ltpa.xml” file must be placed in the ConfigMap that stores all custom liberty configurations (on both server and console charts); to place it into the ConfigMap, launch the following command:
oc create configmap <configmap_name> --from-file=<host_dir>/wa_ltpa.xml --namespace=<workload-automation-project>
For further details about ConfigMap, see the “Creating ConfigMaps” chapter on the cloud platform documentation.
In both the server and console charts, the useCustomizedCert property must be set to “true”, and the libConfigName and certSecretName properties must be configured with the names defined in the commands previously launched.
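The following sketch shows where these properties are set in the server and console sections, using the secret and ConfigMap names created in the previous commands as placeholders:

```yaml
waserver:
  server:
    useCustomizedCert: true
    certSecretName: <secret_name>     # the secret containing ltpa.keys
    libConfigName: <configmap_name>   # the ConfigMap containing wa_ltpa.xml
waconsole:
  console:
    useCustomizedCert: true
    certSecretName: <secret_name>
    libConfigName: <configmap_name>
```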
IBM Workload Automation requires persistent storage for each component (server, console and agent) that you deploy to maintain the scheduling workload and topology.
To make all of the configuration and runtime data persistent, the Persistent Volume you specify must be mounted in the following container folder:
/home/wauser
The Pod is based on a StatefulSet. This guarantees that each Persistent Volume is mounted in the same Pod when it is scaled up or down.
For test purposes only, you can configure the chart so that persistence is not used.
IBM Workload Automation can use either dynamic provisioning or static provisioning using a pre-created persistent volume
to allocate storage for each component that you deploy. You can pre-create Persistent Volumes to be bound to the StatefulSet using Label or StorageClass. It is highly recommended to use persistence with dynamic provisioning. In this case, you must have defined your own Dynamic Persistence Provider. IBM Workload Automation supports the following provisioning use cases:
Kubernetes dynamic volume provisioning to create both a persistent volume and a persistent volume claim.
This type of storage uses the default storageClass defined by the Kubernetes admin or by using a custom storageClass which overrides the default. Set the values as follows:
Specify a custom storageClassName per volume or leave the value blank to use the default storageClass.
Persistent storage using a predefined PersistentVolume set up prior to the deployment of this chart.
Pre-create a persistent volume. If you configure the label=value pair described in the following Note, then the persistent volume claim is automatically generated by the Operator and bound to the persistent volume you pre-created. Set the global values as follows:
Note: By configuring the following two parameters, the persistent volume claim is automatically generated. Ensure that this label=value pair is inserted in the persistent volume you created:
Let the Kubernetes binding process select a pre-existing volume based on the accessMode and size. Use selector labels to refine the binding process.
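The two parameters referred to above are presumably persistence.dataPVC.selector.label and persistence.dataPVC.selector.value, listed in the parameter tables. The following is a minimal sketch of a pre-created persistent volume carrying such a label; the name, capacity, and NFS details are placeholders:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: wa-server-pv                  # placeholder name
  labels:
    my-volume-label: my-volume-value  # label=value pair matched by the selector parameters
spec:
  capacity:
    storage: 5Gi
  accessModes:
  - ReadWriteOnce                     # the only access mode supported by IBM Workload Automation
  persistentVolumeReclaimPolicy: Retain
  nfs:                                # any storage plugin supported by your cluster can be used
    server: <nfs_server>
    path: <nfs_path>
```

The matching fragment of the custom resource would then set the selector parameters accordingly, for example for the server component:

```yaml
waserver:
  persistence:
    enabled: true
    useDynamicProvisioning: false     # static provisioning with a pre-created PV
    dataPVC:
      selector:
        label: my-volume-label
        value: my-volume-value
      size: 5Gi
```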
Before you deploy all of the components, you have the opportunity to choose your persistent storage from the available persistent storage options in OpenShift Container Platform that are supported by Workload Automation or, you can leave the default storageClass.
For more information about all of the supported storage classes, see the table in Storage classes static PV and dynamic provisioning.
If you create a storageClass object or use the default one, ensure that you have a sufficient amount of backing storage for your IBM Workload Automation components.
For more information about the required amount of storage you need for each component, see the Resources Required section.
Custom storage class:
Modify the persistence.dataPVC.storageClassName parameter in the YAML file by specifying the custom storage class name when you deploy the IBM Workload Automation product components.
Default storage class:
Leave the value of the persistence.dataPVC.storageClassName parameter blank in the YAML file when you deploy the IBM Workload Automation product components.
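For reference, a sketch showing where the parameter is set for the console component (use a custom storage class name, or leave the value blank to use the default storageClass):

```yaml
waconsole:
  persistence:
    dataPVC:
      storageClassName: <custom_storage_class>   # leave blank to use the default storageClass
```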
For more information about the storage parameter values to set in the YAML file, see the tables, Agent parameters, Dynamic Workload Console parameters, and Server parameters (master domain manager).
File system security permissions must be well understood to ensure that the uid, gid, and supplemental gid requirements can be satisfied.
IBM Workload Automation supports only ReadWriteOnce (RWO) access mode. The volume can be mounted as read-write by a single node.
IBM Workload Automation uses Grafana to display performance data related to the product. This data includes metrics related to the server and console application servers (WebSphere Application Server Liberty Base), your workload, your workstations, critical jobs, message queues, the database connection status, and more. Grafana is an open source tool for visualizing application metrics. Metrics provide insight into the state, health, and performance of your deployments and infrastructure. IBM Workload Automation cloud metric monitoring uses an opensource Cloud Native Computing Foundation (CNCF) project called Prometheus. It is particularly useful for collecting time series data that can be easily queried. Prometheus integrates with Grafana to visualize the metrics collected.
The following metrics are collected and available to be visualized in the preconfigured Grafana dashboard named Workload Automation Performance Metrics:
Metric Display Name | Metric Name | Description
---|---|---
Workload | application_wa_JobsInPlanCount_jobs | Workload by job status: WAITING, READY, HELD, BLOCKED, CANCELED, ERROR, RUNNING, SUCCESSFUL, SUPPRESS, UNDECIDED
 | application_wa_JobsByWorkstation | Job status by workstation
 | application_wa_JobsByFolder_jobs | Job status by folder
 | application_wa_JobsInPlanCount_jobs | Workload throughput (jobs/minute)
Critical Jobs | application_wa_criticalJob_incompletePredecessor | Incomplete predecessors
 | application_wa_criticalJob_potentialRisk_boolean | Risk level: potential risk
 | application_wa_criticalJob_highRisk_boolean | Risk level: high risk
 | application_wa_criticalJob_estimateEnd_seconds | Estimated end
 | application_wa_criticalJob_confidence_factor | Confidence factor
WA Server - Internal Message Queues | application_wa_msgFileFill_percent | Internal message queue usage for Appserverbox.msg, Courier.msg, mirrorbox.msg, Mailbox.msg, Monbox.msgn, Moncmd.msg, auditbox.msg, clbox.msg, planbox.msg, Intercom.msg, pobox messages, and server.msg
Workstation Status | application_wa_workstation_running | Workstations running
 | application_wa_workstation_linked_boolean | Workstations linked
Database Connection Status | application_wa_DB_connected_boolean | 1 - connected, 0 - not connected
PODs | kube_pod_container_status_restarts_total | Pod restarts (server, console, and agent)
 | kube_pod_status_phase | Failing pods (server, console, and agent)
 | container_cpu_usage_seconds_total | POD CPU usage (server, console, and agent)
 | container_network_transmit_bytes_total | Network I/O (server and console)
 | container_network_receive_bytes_total | Network I/O (server and console)
 | container_memory_usage_bytes | RAM usage (server, console, and agent)
Persistent Volumes | kubelet_volume_stats_used_bytes | For server, console, and agent: volume capacity (used, free)
WA Server and Console - Liberty | base_memory_usedHeap_bytes | Heap usage percentage
 | vendor_session_activeSessions | Active sessions
 | vendor_session_liveSessions | Live sessions
 | vendor_threadpool_activeThreads | Active threads
 | vendor_threadpool_size | Threadpool size
 | base_gc_time_seconds | Time per garbage collection cycle moving average
WA Server and Console - Connection Pools (Liberty) | vendor_connectionpool_inUseTime_total_seconds | Average time usage per connection over last
 | vendor_connectionpool_managedConnections | Managed connections
 | vendor_connectionpool_freeConnections | Free connections
 | vendor_connectionpool_connectionHandles | Connection handles
 | vendor_connectionpool_destroy_total | Created and destroyed connections
The following is an example of the various metrics available with focus on the workload job status:
The following is an example of how persistent volume capacity for the server, console, and agent is visualized:
To get an overview of the cluster health, you can view a selection of metrics on the Workload Automation Performance Metrics predefined dashboard:
From the OpenShift web console, go to Grafana. The Home Dashboard is displayed.
Switch to the Workload Automation organization.
In the left navigation toolbar, click Dashboards.
On the Manage page, select the Workload Automation Performance Metrics dashboard.
On the Workload Automation Performance Metrics page, select the <workload-automation-project> namespace from the pull-down menu.
The dashboard is displayed. Drill down into the metrics to gain insight into the state, health, and performance of your deployments and infrastructure.
For more information about using Grafana dashboards, see Dashboards overview.
To access the complete product documentation library for IBM Workload Automation, see the IBM Knowledge Center.
If a problem occurs while using IBM Workload Automation, IBM® Customer Support might ask you to supply information about your system and environment to perform problem determination. The following utilities are available:
In case of problems related to the product, see Troubleshooting.
Problem: The broker server cannot be contacted. The Dynamic Workload Broker command line requires additional configuration steps.
Workaround: Perform the following configuration steps to enable the Dynamic Workload Broker command line:
From the machine where you want to use the Dynamic Workload Broker command line, master domain manager (server) or dynamic agent, locate the following file:
/home/wauser/wadata/TDWB_CLI/config/CLIConfig.properties
Modify the values of the keyStore and trustStore fields in the CLIConfig.properties file as follows:
keyStore=/home/wauser/wadata/ITA/cpa/ita/cert/TWSClientKeyStoreJKS.jks
trustStore=/home/wauser/wadata/ITA/cpa/ita/cert/TWSClientKeyStoreJKS.jks
Save the changes to the file.
Problem: When you try to access the Console, an error similar to the following is returned: 401 Unauthorized.
Workaround: See the information available at Authentication onboarding and single sign-on.
Additional metrics are monitored by Prometheus and made available in the preconfigured Grafana dashboard.
Automation Hub integrations (plug-ins) now automatically installed with the product container deployment
New procedure for installing custom integrations
New configurable parameters added to the IBM Workload Automation custom resource for the agent, console, and server components:
New optional configurable parameter added to the IBM Workload Automation custom resource for the server component: waserver.server.ftaName which represents the name of the Workload Automation workstation for the installation.
RFE 148080: Provides the capability to constrain a product component pod to run on particular nodes. The nodeAffinityRequired parameter has been added to the custom resource configurable parameters for the agent, console, and server components so you can determine on which nodes a component can be deployed using custom labels on nodes and label selectors specified in pods.