Troubleshooting HERO

Symptom

The KPI dashboard is not displayed.

Cause and solution

Verify that the Kibana service is up and that the browser can reach it.

Symptom

Queue data is not shown.

Cause and solution

Verify that ElasticSearch is up and reachable from the workstation.
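
One way to run this check from the workstation is to query the Elasticsearch cluster health endpoint and read its status field. The following is a minimal sketch: the helper names es_status and check_elasticsearch are hypothetical, and the default URL is an assumption to replace with the address of your Elasticsearch instance (port 9200 matches the curl commands used elsewhere in this section).

```shell
# Extract the "status" field (green, yellow, or red) from the JSON
# returned by the Elasticsearch _cluster/health endpoint.
# es_status is a hypothetical helper name used for this sketch.
es_status() {
  printf '%s\n' "$1" | sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Query the cluster and report its status. The default URL is an
# assumption: pass the address of your Elasticsearch instance instead.
check_elasticsearch() {
  url=${1:-http://localhost:9200}
  if body=$(curl -s --max-time 10 "$url/_cluster/health"); then
    echo "Elasticsearch status: $(es_status "$body")"
  else
    echo "Elasticsearch is not reachable at $url" >&2
    return 1
  fi
}
```

For example, run check_elasticsearch http://<elasticsearch_host>:9200 from the workstation. A failure at this step points to a service or network problem rather than to HERO itself.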

Symptom

Environment and workstations are not displayed.

Cause and solution

Verify that the HERO server is up and that the connection to the Derby instance is working.

Symptom

The discovery of a server fails, even with correct user and password.

Cause and solution

Check that the user can connect to the workstation through SSH. From the HERO server, run the command:

ssh <server_to_discover>

and verify that it works.

Check if the following message is logged in the Tomcat output: ERROR SSHConnection:469 - com.jcraft.jsch.JSchException: 4: Received message is too long.

In this case, the issue is related to the output written by the .bashrc script. Move any script or instruction that writes output to the .bash_profile.
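
The error appears because a non-interactive SSH session receives text that it does not expect. As an alternative to moving the commands, output in the .bashrc script can be restricted to interactive shells by testing the shell flags in $-. The following is a sketch only: the function name show_banner and the banner text are illustrative assumptions.

```shell
# Fragment for ~/.bashrc: print a banner only in interactive shells.
# "$-" contains the letter i when the shell is interactive, so
# non-interactive sessions (such as those opened over SSH by a tool)
# produce no output. show_banner is a hypothetical name.
show_banner() {
  case "$-" in
    *i*) echo "Welcome to $(hostname)" ;;
  esac
}
show_banner
```

With this guard in place, a non-interactive command such as ssh <server_to_discover> true should print nothing.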

If you are using the dockerized version of WA, check that the user can create a Docker image. If not, add the user to the “docker” group.

If you are discovering a Windows machine, check that all the prerequisites are met. Check that Python has the correct version and is 64-bit.

Symptom

A machine cannot be retrieved.

Cause and solution

If a machine cannot be retrieved, perform the following checks:

Verify that the connection between HERO and the machine is up and that no firewall is blocking it. To verify the connection, use the ping command, the ssh command, or both from the HERO machine.
Verify that the machine to be retrieved has Workload Automation installed, at a version compatible with the HERO version.
Verify that the machine to be retrieved is up and running.
Remove any login welcome message from the .bashrc file.

Symptom

HERO is not starting.

Cause and solution

If the HERO server (Dashboard.war) is not starting, check the following conditions:

If the following line shows up in the log:

springSecurityFilterChain' threw exception; nested exception is java.lang.NoClassDefFoundError: javax/xml/bind/JAXBException,

it means that you are trying to start the HERO server with Oracle Java or OpenJDK version 9 or 10, which are not supported. You must use either OpenJDK 8 or Oracle Java 8.

If you installed HERO on a SELinux machine such as RHEL or CentOS, make sure you set the hostname to the Fully Qualified Domain Name of the machine. To identify it, run the command hostname --fqdn.
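
To check which Java version the HERO server would start with (the first condition above), the major version can be read from the output of java -version. The following is a minimal sketch: java_major is a hypothetical helper name, and the script assumes that the java command found on the PATH is the one used to start HERO.

```shell
# Print the major Java version given the first line of `java -version`
# output, for example: openjdk version "1.8.0_292" gives 8.
# java_major is a hypothetical helper name used for this sketch.
java_major() {
  ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  case "$ver" in
    1.*) printf '%s\n' "$ver" | cut -d. -f2 ;;   # legacy scheme: 1.8 is Java 8
    *)   printf '%s\n' "$ver" | cut -d. -f1 ;;   # newer scheme: 10.0.2 is Java 10
  esac
}

# `java -version` writes to stderr, so redirect it before reading.
if command -v java >/dev/null 2>&1; then
  major=$(java_major "$(java -version 2>&1 | head -n 1)")
  if [ "$major" = "8" ]; then
    echo "Java 8 detected"
  else
    echo "Java $major detected: use OpenJDK 8 or Oracle Java 8"
  fi
fi
```

Run the script as the user that starts the HERO server, because a different user might resolve a different java executable.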

Symptom

ElasticSearch does not start, and the log shows the following error:

[2018-09-21T14:11:16,039][INFO ][o.e.b.BootstrapChecks ] [qVAFkOU] bound or publishing to a non-loopback address, enforcing bootstrap checks

ERROR: [2] bootstrap checks failed

[1]: max file descriptors [4096] for elasticsearch process is too low, increase to at least [65536]

[2]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]

Cause and solution

To solve this problem, run the following procedure.

How to verify and set ulimit parameter:

Check the maximum number of open files for the current user by running the command ulimit -n
Verify that the number of allowed open files for the current user is at least 65536
Check the Hard limit for the current user, by running the command ulimit -n -H
Check the Soft limit for the current user, by running the command ulimit -n -S
If the hard or soft limit is lower than 65536, increase it by editing the file:

/etc/security/limits.conf

Each entry in the file has the following format:

[domain] [type] [item] [value]

where:

[domain] can be a username, a group name, or a wildcard entry
[type] is the type of the limit and can have the following values:

soft: a soft limit which can be changed by user
hard: a cap on soft limit set by super user and enforced by kernel

[item] is the resource for which you are setting the limit (nofile for the maximum number of open files)
[value] is the value to assign to the limit

For example, for a user with ID hmuser, run the following steps:

Add or modify the soft and hard limits as follows:

hmuser soft nofile 65536
hmuser hard nofile 65536

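Activate the new values by running the following command: sysctl -p. The limits set in /etc/security/limits.conf apply to sessions opened after the change, so log out and log in again as the user.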
Update the following files:

/etc/systemd/user.conf
/etc/systemd/system.conf

by adding the following line:

DefaultLimitNOFILE=65536
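
The verification part of this procedure can be sketched as a short script that compares the soft and hard open-file limits with the required minimum. The helper name limit_ok is hypothetical; the threshold 65536 comes from the error message above.

```shell
REQUIRED=65536   # minimum required by the Elasticsearch bootstrap check

# Return success when a limit value satisfies the requirement.
# limit_ok is a hypothetical helper name used for this sketch.
limit_ok() {
  case "$1" in
    unlimited) return 0 ;;
    ''|*[!0-9]*) return 1 ;;        # empty or not a number
  esac
  [ "$1" -ge "$REQUIRED" ]
}

# Report the soft (-S) and hard (-H) open-file limits of the current user.
for kind in S H; do
  value=$(ulimit -"$kind" -n)
  if limit_ok "$value"; then
    echo "$kind limit $value: OK"
  else
    echo "$kind limit $value: too low, must be at least $REQUIRED"
  fi
done
```

Run the script as the user that starts Elasticsearch, in a session opened after the limits were changed.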

Symptom

The log shows the following error: [1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144].

Cause and solution

On Linux, run the following procedure.

How to verify and set the available virtual memory

To verify the available virtual memory, run the following command as the user that started the Docker daemon:

sysctl vm.max_map_count

If the command output shows a value lower than 262144, run the command:

sysctl -w vm.max_map_count=262144

To set this value permanently, edit the vm.max_map_count setting in /etc/sysctl.conf.
Add the following as last row, or edit the row if present:

vm.max_map_count=262144

Verify the new value after reboot.
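
The permanent change ("add the row, or edit it if present") can be sketched as a small idempotent helper. The name set_conf_value is hypothetical, and the sketch assumes GNU sed, as found on most Linux distributions. The function only edits the file passed to it, so it can be tried on a copy of /etc/sysctl.conf first.

```shell
# Set key=value in a sysctl-style configuration file: edit the row if the
# key is already present, otherwise append it as the last row.
# set_conf_value is a hypothetical helper name; sed -i -E assumes GNU sed.
set_conf_value() {
  file=$1 key=$2 value=$3
  if grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$file"; then
    sed -i -E "s|^[[:space:]]*${key}[[:space:]]*=.*|${key}=${value}|" "$file"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Example (run as root):
#   set_conf_value /etc/sysctl.conf vm.max_map_count 262144
#   sysctl -p    # reload the settings from /etc/sysctl.conf
```

Running the helper twice leaves a single vm.max_map_count row in the file, which avoids conflicting entries.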

Symptom

Keycloak container restarts, or the log shows the following error:

"Security-Enhanced Linux (SELinux) on the hypervisor must be disabled or the permissions must be set up correctly".

Cause and solution

To set SELinux to permissive, run the following procedure.

How to set SELinux to permissive

Run the following commands:

sed -i 's/^SELINUX=.*$/SELINUX=permissive/' /etc/selinux/config

setenforce 0

sed -i 's/^SELINUX=.*$/SELINUX=disabled/' /etc/selinux/config

Restart the system to make the changes permanent.
After you restart the system, you can use the getenforce command to check the SELinux status.

Symptom

Kibana is not creating the default index pattern.

Cause and solution

Delete the Kibana index by running the command “DELETE .kibana” in Kibana Dev Tools, then try to create the index pattern again.

Symptom

The discovery process shows the following warning message: The machine has been discovered, with some warnings - No such file: It was not possible to establish an http callback via CURL command. The monitors will not work. Check that is possible to send an http/https request from the machine you are discovering to HERO server.

Cause and solution

If you receive the following command output: curl: (2) Failed initialization, the problem is related to the configuration of the LD_LIBRARY_PATH variable.

Correct the LD_LIBRARY_PATH variable so that the curl command works correctly. For example, you might need to change the .bashrc file and include the system libraries path in that variable.
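
As an illustration of such a change, the following sketch adds a system library directory to the variable without duplicating it. The directory /usr/lib64 is an assumption: use the directory that holds the libraries curl needs on your system. The helper name prepend_unique is hypothetical.

```shell
# Fragment for ~/.bashrc: add a system library directory to
# LD_LIBRARY_PATH without duplicating it. /usr/lib64 is an assumption;
# replace it with the library directory used on your system.
# prepend_unique is a hypothetical helper name.
prepend_unique() {
  # $1: current colon-separated value, $2: directory to add
  case ":$1:" in
    *":$2:"*) printf '%s\n' "$1" ;;             # already present: unchanged
    *)        printf '%s\n' "$2${1:+:$1}" ;;    # prepend the directory
  esac
}

LD_LIBRARY_PATH=$(prepend_unique "${LD_LIBRARY_PATH:-}" /usr/lib64)
export LD_LIBRARY_PATH
```

To see which libraries curl actually loads, run ldd "$(command -v curl)" on the machine being discovered.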

Symptom

Monitors do not activate.

Cause and solution

If the monitors do not activate, it means that the monitored server is not connecting back to HERO.

Connect to the monitored remote machine with an SSH client and run the monitors manually. Monitors are located under deployPath (usually userHome\<ip/hostname>\HERO\). Check for any errors.

Make sure that the HERO server is reachable, that the HTTPS port is open, and that curl is working. The HTTPS port is configurable: it is the one specified at installation time.

Symptom

While creating Kibana index pattern to retrieve data from Elasticsearch, the following warning message is displayed: Warning - No default index pattern. You must select or create one to continue.

Cause and solution

To solve this problem, run the following curl commands on the machine that hosts the HERO Docker installation:

docker exec hero-tomcat curl -k -XDELETE "http://elasticsearch:9200/.kibana" -H 'Content-Type: application/json'
docker exec hero-tomcat curl -k -XPOST -H "kbn-xsrf: reporting" "http://kibana:5601/api/saved_objects/index-pattern" -H 'Content-Type: application/json' -d '{"attributes":{"title":"run*"}}'
docker exec hero-tomcat curl -k -XPOST -H "kbn-xsrf: reporting" "http://kibana:5601/api/kibana/settings/defaultIndex" -H 'Content-Type: application/json' -d '{"value":"run*"}'

Symptom

The following error is displayed: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/bionic/InRelease Temporary failure resolving 'archive.ubuntu.com'.

Cause and solution

To solve this problem, run the following steps:

Edit /etc/default/docker and add the following line:

DOCKER_OPTS="--dns <your_dns_server_1> --dns <your_dns_server_2>"

You can add as many DNS servers as you want to this config.

Once saved, restart your Docker service:

sudo service docker restart

Symptom

During HERO online installation, Docker fails to download HERO images from the repository.

Cause and solution

If you are using a proxy connection to the internet, the same proxy must be configured in Docker. For details, see: Docker documentation.
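
On hosts where Docker runs as a systemd service, one common way to configure the proxy is a drop-in file for the docker service. The following is a sketch under that assumption: write_docker_proxy_dropin is a hypothetical helper name, and the proxy URL and the NO_PROXY list are placeholders to replace with your own values.

```shell
# Write a systemd drop-in that passes proxy settings to the Docker daemon.
# write_docker_proxy_dropin is a hypothetical helper name; the proxy URL
# and the NO_PROXY list are placeholders to replace with your own values.
write_docker_proxy_dropin() {
  dir=$1 proxy=$2 no_proxy=${3:-localhost,127.0.0.1}
  mkdir -p "$dir" || return 1
  cat > "$dir/http-proxy.conf" <<EOF
[Service]
Environment="HTTP_PROXY=$proxy"
Environment="HTTPS_PROXY=$proxy"
Environment="NO_PROXY=$no_proxy"
EOF
}

# Example (run as root on a systemd-based host):
#   write_docker_proxy_dropin /etc/systemd/system/docker.service.d http://<proxy_host>:<proxy_port>
#   systemctl daemon-reload
#   systemctl restart docker
```

After the restart, retry the HERO online installation so that Docker pulls the images through the proxy.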

Symptom

HERO Web page keeps reloading or blinking.

Cause and solution

Check if keycloak or hero-nginx container log files show "host unreachable" or "unknown host" error messages. The issue might have different causes and solutions:

If HERO server is using a proxy connection to the internet, you must configure the same proxy in Docker. For details, see: Docker documentation.
Open <hero external port> in your OS firewall. For instance, on RHEL run the following commands:

sudo firewall-cmd --zone=public --add-port=<hero external port>/tcp --permanent
sudo firewall-cmd --reload

Symptom

The HERO UI logs in and logs out repeatedly.

Cause and solution

Ensure that the secret key value is the same in the Keycloak UI, in the ui.properties file and in the .tomcat.env (<HERO_HOME>/CONFIGURATION/HERO) file.

In case of mismatch:

Get the secret key value from Keycloak UI (Client > nginx > Credentials (tab) > Secret)
Update the ui.properties file.
Remove the configuration volume by running: docker volume rm <BUILD_DIR>_hero-home
Run docker-compose up --build -d

Symptom

Alerts by email are not sent and docker logs the message "Hero-tomcat shows app security exceptions from a gmail account".

Cause and solution

The mail account must be properly configured. See Google account help: Sign in with App Passwords.

Symptom

The Final job stream monitor discovers an error even if the Final job stream correctly started at the scheduled time.

Cause and solution

Check if HERO time zone (default is UTC) is different from the WA server time zone.

If it is different, run the following steps:

Update HERO time zone by changing the docker-compose.yml file. For details, see step 6 of HERO Installation procedure.
From <BUILD_DIR> directory, run the command:

docker-compose up -d --build