Defining step failure behavior settings
When creating an Automation Plan, you can define, for each step in the plan, the behavior that occurs if that step fails. This is known as step failure behavior and is distinct from adding a failure step to a step. You can also set advanced failure behavior, which enables you to specify a period of time after which to fail the step on targets that have not returned a status.
How it works
Use the step failure behavior feature when designing your Automation Plan to control the flow of the plan on endpoints. It lets you define the behavior that occurs when steps in your Automation Plan fail on some or all endpoints.
The overall step failure behavior is defined by two separate settings. The first setting, step failure mode, defines if the Automation Plan should stop at that point. The second setting, the failed targets behavior, defines if failed targets are included or excluded from subsequent steps. Separately, and regardless of the values that you define for step failure behavior, if a failure step is defined, the failure step is always run before the step failure behavior is processed. After the failure step is run, the system then processes the remaining steps in the plan, based on what you defined in the step failure behavior settings in the step that failed.
To define step failure behavior, first choose whether to continue or stop the Automation Plan. To do this, select an option from the Step Failure Mode list. If you select the option to stop the Automation Plan on step failure, you do not make any further selection. The Automation Plan action is stopped at this point. If you want to continue the Automation Plan, you must decide if you want to continue on all endpoints or only on those endpoints on which the step was successful.
The Step Failure Mode list provides the following options:
Option | Description |
---|---|
Stop Automation Plan | Select this option to run any associated failure step and then stop the Automation Plan. |
Continue Automation Plan | Select this option to run any associated failure step and then move on to the next step in the Automation Plan. |
If you continue the Automation Plan, the failed targets behavior provides the following options:
Option | Description |
---|---|
Include in Future Steps | The endpoints on which the step failed are included in future steps in the Automation Plan. |
Exclude from Future Steps | The Automation Plan continues on the endpoints on which the step was successful. The endpoints on which the step failed are removed from the future steps. |
For Automation Plans created previously, default values are implemented. The default value is Stop Automation Plan. When you open a legacy Automation Plan and save it, the new attributes are added to the saved Automation Plan.
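The interaction between the two settings can be sketched as a small decision function. This is an illustrative model only, not the product's implementation: the names `StepFailureMode`, `FailedTargetsBehavior`, and `targets_for_next_step` are hypothetical, and the sketch assumes that any failure on any endpoint counts as a failed step and that any failure step has already run.

```python
from enum import Enum


class StepFailureMode(Enum):
    STOP = "Stop Automation Plan"
    CONTINUE = "Continue Automation Plan"


class FailedTargetsBehavior(Enum):
    INCLUDE = "Include in Future Steps"
    EXCLUDE = "Exclude from Future Steps"


def targets_for_next_step(targets, failed, mode,
                          failed_behavior=FailedTargetsBehavior.INCLUDE):
    """Return the endpoints the next step runs on, or None if the plan stops.

    Hypothetical helper: assumes any failure step has already run, because
    the failure step always runs before step failure behavior is processed.
    """
    failed = set(failed)
    if not failed:
        return list(targets)  # the step succeeded on every endpoint
    if mode is StepFailureMode.STOP:
        return None  # plan stops; the failed targets behavior is not consulted
    if failed_behavior is FailedTargetsBehavior.EXCLUDE:
        # continue only on the endpoints on which the step was successful
        return [t for t in targets if t not in failed]
    return list(targets)  # continue on all endpoints, including failed ones
```

For example, with targets `a`, `b`, and `c` and a failure on `b`, the Continue Automation Plan mode combined with Exclude from Future Steps yields `a` and `c` for the next step, while Stop Automation Plan yields no next step at all.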
Step failure behavior and failure step targeting
Step failure behavior targeting is different from failure step targeting. When you add a failure step to an Automation Plan, you can apply that failure step to all endpoints targeted in the step, or to only the endpoints on which the step failed. If you add a failure step to a step and set the targeting for that failure step to apply to all endpoints, this targeting might be superseded if you have defined step failure behavior settings. If you have step failure mode defined as Continue Automation Plan and Exclude from Future Steps, any associated failure step targeting is set automatically to Failed Only. The reason for this is that you do not want to run the failure step against endpoints on which you want to run future steps, as this is defined in the step failure behavior settings. Instead, you want to run the failure step only on the endpoints that will be omitted from future steps.
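The override rule described above can be expressed as a short function. This is a minimal sketch under stated assumptions: the function name is hypothetical, and the label `"All"` for targeting all endpoints is an assumed placeholder, since only the Failed Only label is named in this topic.

```python
def effective_failure_step_targeting(configured, step_failure_mode,
                                     failed_targets_behavior):
    """Return the targeting actually used for a failure step.

    configured: "All" (assumed label) or "Failed Only".
    The other two arguments use the option names from the tables above.
    """
    if (step_failure_mode == "Continue Automation Plan"
            and failed_targets_behavior == "Exclude from Future Steps"):
        # Endpoints that continue to future steps must not run the failure step.
        return "Failed Only"
    return configured
```

In this model, a failure step configured for all endpoints is narrowed to Failed Only when the plan continues and excludes failed targets; in every other combination the configured targeting is left unchanged.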
Tracking Automation Plan actions and step failure behavior
You can view Automation Plan actions and step actions on the Automation Plan Action Status dashboard. If a step in your Automation Plan fails, the failure is indicated in the Status column. For steps that fail and do not have step failure behavior defined, a status of Failed is displayed in the Status column. For steps that fail and have step failure behavior defined, a status of Failed on some targets is displayed, identifying that the step has step failure behavior defined and has failed on some targets. In other words, a status of Failed means that the step failed and that the Automation Plan runs any associated failure step and then stops. A status of Failed on some targets means that the step failed on some targets but the Automation Plan continues to run, according to the settings defined by the step failure behavior, either on all endpoints or only on the endpoints on which the step was successful.
To view the endpoints on which the step failed, click the Detail icon for the particular step.
Advanced failure behavior
Advanced failure behavior enables you to design your Automation Plan to run within scheduled maintenance windows by allowing you to specify a time limit for steps to complete on target endpoints. This enables you to control how steps complete and to fail steps on endpoints on which the step has not completed after a period of time that you specify. For example, if you have a maintenance window of 60 minutes and need to include three steps in your plan, you can enable advanced failure behavior and enter a time period of 20 minutes for each step. When a step runs, if 20 minutes elapse and the step has not completed on some endpoints, the step is failed on those endpoints. The advanced failure behavior settings are disabled by default. You can enable them when you create the plan or when you run it:
- If you are creating your plan, click the Default Settings icon for the step and go to the Execution tab. Enable the check box for Fail incomplete targets and enter a period of time, in minutes, after which you want to fail the step on any endpoints on which the step has not completed.
- If you are running your plan, from the Take Automation Plan Action screen, click the Execution tab and enable the check box for Fail incomplete targets and enter a period of time, in minutes, after which you want to fail the step on any endpoints on which the step has not completed.
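The effect of the Fail incomplete targets setting can be modeled as follows. This is a sketch only: the function name and the status strings `"Completed"`, `"Failed"`, and `"Running"` are hypothetical placeholders, and the sketch assumes the time limit is measured in minutes from when the step starts.

```python
def fail_incomplete_targets(statuses, elapsed_minutes, limit_minutes=None):
    """Return per-endpoint statuses after applying the time limit.

    statuses: dict mapping endpoint name to a status string (hypothetical
        values: "Completed", "Failed", "Running").
    limit_minutes: None models the default, where the setting is disabled.
    """
    if limit_minutes is None or elapsed_minutes < limit_minutes:
        return dict(statuses)  # setting disabled, or limit not yet reached
    final = {"Completed", "Failed"}
    # Any endpoint without a final status is failed once the limit elapses.
    return {ep: (s if s in final else "Failed") for ep, s in statuses.items()}
```

With a 20-minute limit, an endpoint that is still running at minute 20 is marked as failed, while endpoints that already completed or failed keep their status.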
Setting step failure threshold
The step failure threshold enables you to manage the success or failure of the step based on the percentage success and failure rate of the step on the total number of target endpoints. Setting a step failure threshold allows you to specify the percentage of failing targets that determines the success or failure of the step. For example, if you set the Step Failure Threshold at more than 5% and the step fails on more than 5% of targeted endpoints, the step is treated as a failed step. If you have set a failure step, the failure step is then executed. If you set the Step Failure Threshold at more than 5% and the step fails on 5% or fewer of targeted endpoints, the step is treated as successful and a failure step, if set, is not executed.
- Open the Automation Plan that contains the step for which you want to configure the step failure threshold and click Edit.
- Select the step for which you want to configure the step failure threshold.
- Click the Default Settings icon for the step and go to the Execution tab.
- In the Step Failure Threshold section, enter a percentage value for the threshold at which to fail the step. For example, if you enter more than 25%, the step is failed if it is unsuccessful on more than 25% of the endpoints targeted. If the step is unsuccessful on 25% or fewer of the endpoints, the step is treated as successful. The default value is any, which means that if the step fails on any endpoint, the step is treated as a failure and, if you have defined a failure step for the step, the failure step is executed.
- Click OK and then repeat this process for each step for which you want to configure a step failure threshold.
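The threshold comparison can be checked with a few lines of arithmetic. The following is a minimal sketch, not the product's logic: the function name is hypothetical, and `threshold_percent=None` is used here to stand in for the default value of any.

```python
def step_is_failed(failed_count, total_targets, threshold_percent=None):
    """Decide whether a step is treated as failed.

    threshold_percent: the "more than N%" value; None models the default
        ("any"), where a single failing endpoint fails the step.
    """
    if total_targets <= 0:
        raise ValueError("total_targets must be positive")
    if threshold_percent is None:
        return failed_count > 0
    # Strictly "more than": compare with integers to avoid rounding issues.
    return failed_count * 100 > threshold_percent * total_targets
```

For 100 targeted endpoints and a threshold of more than 5%, five failures leave the step successful and six failures fail it, so the failure step, if set, runs only in the second case.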
Pending Restart step actions and step failure behavior
When target endpoints report a Pending Restart status for a step action, the system does not automatically stop those Pending Restart step actions. Stopping a Pending Restart step action would prevent it from being updated with the actual result of the action after the restart completes. Instead, the Pending Restart step action remains in an Open state, allowing any Restart Endpoint and Wait for Restart to Complete step that was added to update its status after the restart completes. This enables you to get the actual outcome of the step action once it becomes known.
This becomes more complex if the step that requires the restart fails. Steps in a Pending Restart state can fail if the step times out or if one or more endpoints report a failure status. If a step in a Pending Restart state fails, the Pending Restart step action still remains open, which affects how the step failure behavior is processed. Here are two examples that illustrate how the Pending Restart state works with the step failure behavior settings.
Scenario 1: Failed Pending Restart Step and failure behavior set to Stop plan
In this scenario, step failure behavior is set to Stop Plan. A step fails but some endpoints report back a Pending Restart status. The step that fails has a failure step set. The failure step is a Restart Endpoints and Wait step. The Pending Restart is then processed as follows:
- The system leaves the failed step in an Open state and runs the failure step.
- The targets that are in a Pending Restart state eventually report back with the actual result of the step action.
- The system then stops all actions: for the failed step, the failure step, and the plan.
Scenario 2: Failed Pending Restart Step and failure behavior set to Continue Plan
In this scenario, step failure behavior is set to Continue Plan. A step fails but some endpoints report back a Pending Restart status. The step that fails does not have a failure step set. The Pending Restart is processed as follows:
- The system leaves the failed step in an Open state and executes the subsequent step. The next step in the plan is a Restart Endpoints and Wait step.
- Next, the Pending Restart targets of the failed step report back the actual status of the step action.
- The system then stops the Restart Endpoints and Wait step and processes the remaining steps in the plan.
- Last, when all steps have been processed, the system stops all remaining open step actions, including the action for the failed step, and then stops the plan action.