Restarting work on different workstations
When jobs or started tasks are restarted on a workstation other
than the one on which they were originally defined, consider
the following:
- Data set availability. All the data sets required by the job or started task should be available on the system associated with the alternate workstation.
- The RACF® environment on the system associated with the alternate workstation.
- The initiator structure on the alternate system.
The following points apply to automatic recovery in general:
- Jobs and started tasks should be made (as far as possible) restartable
from the failed step. A major problem is the handling of work files
conveying information from one step to another. One way of dealing
with this is:
- At the beginning of the job or started task, add a special step (executing IEFBR14) that refers to all work files with DISP=(MOD,DELETE,DELETE). MOD is needed here because OLD causes a JCL error when a work file does not yet exist.
- When you create the file, code DISP=(NEW,CATLG,DELETE).
- When you receive the file and you must pass it to the next step, code DISP=(OLD,PASS,KEEP).
- At the end of the job or started task, add an extra step executing IEFBR14 with a DD statement for each work file specifying DISP=(OLD,DELETE,KEEP).
- Files passed across job or started-task steps, and between jobs or started tasks, must be permanent or intermediate files.
- DISP=MOD should be used with care, because it can cause problems with restart: a rerun step appends to whatever the failed run already wrote to the data set.
- It is better to have small, uncomplicated jobs or started tasks with few steps rather than large, complicated jobs or started tasks with many steps. A complicated process should be broken into several smaller jobs or started tasks.
- Cataloging should be performed in a separate job. This is especially
important when using generation data group (GDG) data sets. Usually, a job
runs with input as generation 0 and output as generation +1. On rerun, input
must be referenced as generation -1 and output as generation 0, which
requires JCL changes. An alternative method is to catalog the new generation
in a preceding job. Because the new generation already exists when the
processing job starts, the same relative generation numbers are valid for
both the first run and a rerun. The cataloging job is shown in the following example:
//A105C01  JOB ....
//STEP01   EXEC PGM=IEFBR14
//A105CTLG DD DSN=A105.INVOICE.BASE(+1),
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=DISK,
//            SPACE=(CYL,(5,1))
//* DCB PARMS ARE AVAILABLE IN THE MODEL DATA SET.
The following example refers to the generation data group with the former generation 0 as generation -1 and the new generation as generation 0:
//A105P01  JOB ....
//STEP01   EXEC PGM=A105PGMP
//A105PIN  DD DSN=A105.INVOICE.BASE(-1),      INPUT
//            DISP=OLD
//A105POUT DD DSN=A105.INVOICE.BASE(0),       OUTPUT
//            DISP=OLD
- Do not use backward references. Backward references to data sets in previous steps make restarts more complicated.
- Avoid the use of return codes to control the execution of successive steps under normal conditions because this can lead to restart problems. Use return codes only to bypass step execution after failure, for example, COND=(0,NE).
- Always code a step name for every step (for example, STEPnn, where nn is the step sequence number).
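The work-file handling described in the first point can be sketched as a single job. This is an illustration only, not JCL taken from the product: the job name WORKJOB, the data set name HLQ.WORKJOB.WORK1, the programs PGMA, PGMB, and PGMC, and the UNIT and SPACE values are all hypothetical. The cleanup step in this sketch uses MOD as the data set status so that it also runs when the work file does not exist; in that case the system allocates the file before deleting it, which is why UNIT and SPACE are coded. DCB attributes are omitted and assumed to be supplied by the programs.

```jcl
//WORKJOB  JOB ....
//* STEP01: DELETE ANY WORK FILE LEFT BY AN EARLIER FAILED RUN.
//* UNIT AND SPACE ARE USED ONLY IF THE FILE DOES NOT EXIST.
//STEP01   EXEC PGM=IEFBR14
//WORK1    DD DSN=HLQ.WORKJOB.WORK1,
//            DISP=(MOD,DELETE,DELETE),
//            UNIT=SYSDA,
//            SPACE=(TRK,(1,1))
//* STEP02: CREATE AND CATALOG THE WORK FILE (HYPOTHETICAL PGMA).
//STEP02   EXEC PGM=PGMA
//OUT1     DD DSN=HLQ.WORKJOB.WORK1,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,
//            SPACE=(CYL,(5,1))
//* STEP03: RECEIVE THE WORK FILE AND PASS IT TO THE NEXT STEP.
//* COND=(0,NE) BYPASSES THE STEP IF AN EARLIER STEP ENDED NONZERO.
//STEP03   EXEC PGM=PGMB,COND=(0,NE)
//IN1      DD DSN=HLQ.WORKJOB.WORK1,
//            DISP=(OLD,PASS,KEEP)
//* STEP04: THE FILE IS NAMED EXPLICITLY, NOT BY A BACKWARD
//* REFERENCE SUCH AS DSN=*.STEP03.IN1.
//STEP04   EXEC PGM=PGMC,COND=(0,NE)
//IN1      DD DSN=HLQ.WORKJOB.WORK1,
//            DISP=(OLD,PASS,KEEP)
//* STEP05: FINAL CLEANUP, BYPASSED AFTER A NONZERO RETURN CODE
//* SO THAT THE WORK FILE IS STILL THERE FOR A STEP RESTART.
//STEP05   EXEC PGM=IEFBR14,COND=(0,NE)
//WORK1    DD DSN=HLQ.WORKJOB.WORK1,
//            DISP=(OLD,DELETE,KEEP)
```

With this layout, a restart from STEP03 or STEP04 finds the cataloged work file and can refer to it as OLD without JCL changes, while a rerun from the top begins by deleting it in STEP01. Every step is named, and return codes are used only to bypass steps after a failure.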