bmap4j - Batch Management And Processing For Java

Batch-Transaction-Processing Fundamentals - Robustness

Robustness

A batch program, if built accordingly, is considered robust when data consistency remains intact despite intrinsic or extrinsic malfunctions or breakdowns.

Intrinsic malfunctions are caused by program errors or by application components that are unavailable. Extrinsic malfunctions are caused, for example, by communication errors or by the failure of a server or even a whole data center.

Robustness on a micro level is achieved through transactionality, which ensures data consistency at the atomic level.

Robustness on a macro level describes the capability of a batch program to accomplish a given task completely and correctly. This is essentially determined by whether the program can be restarted and resume its work when problems occur.

The following characteristics are therefore important:

Idempotence

An important program characteristic in the area of robustness is so-called idempotence. It describes the program's ability to be restarted after an error, to resume work at the correct position, and to complete successfully. Batch programs should be implemented idempotently whenever possible.

Idempotence is directly related to the data to be processed. Depending on how strongly the functional data support idempotent processing, three types of idempotence can be distinguished:

  • Natural Idempotence: The processing information is contained directly within the functional data. After a restart, work resumes automatically at exactly the right position.
  • Derived Idempotence: The functional data are enhanced with processing information. After a restart, the functional code analyzes this processing information and resumes processing at the correct position.
  • Assisted Idempotence: The batch system logs the processing in cooperation with the functional code. After a restart, the batch system informs the functional code about the processing status, and the functional code resumes processing at the correct position.
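
As an illustration of the second type above, where the functional data carry processing information, the following minimal sketch may help. The `Invoice` and `InvoiceBatch` classes, the `processed` flag and the `failAtId` parameter are hypothetical names chosen for this example, not part of bmap4j, and a real implementation would persist the flag in the same transaction as the functional work.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical record type: "processed" is the processing information
// that has been added to the functional data.
final class Invoice {
    final int id;
    boolean processed;

    Invoice(int id) {
        this.id = id;
    }
}

final class InvoiceBatch {
    // Ids that were actually handled, kept only to make the effect visible.
    final List<Integer> handled = new ArrayList<>();

    // Processes all invoices that are not yet marked as processed.
    // failAtId simulates a breakdown just before that invoice is handled;
    // pass -1 to run without a simulated failure.
    void run(List<Invoice> invoices, int failAtId) {
        for (Invoice invoice : invoices) {
            if (invoice.processed) {
                continue; // done in an earlier run, skipped on restart
            }
            if (invoice.id == failAtId) {
                throw new IllegalStateException("simulated breakdown at invoice " + invoice.id);
            }
            handled.add(invoice.id);  // the functional work
            invoice.processed = true; // assumption: same transaction as the work
        }
    }
}
```

If a first call to `run` aborts partway through, calling `run` again on the same list skips the invoices already marked and handles only the remaining ones, so no invoice is handled twice.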

All-or-Nothing Strategy

The "all-or-nothing strategy" is a special case of idempotence. Through carefully chosen processing and transaction boundaries, either the whole batch is processed correctly or, in case of an error, nothing is processed at all. If a restart occurs, the whole batch has to be repeated.
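
The idea can be sketched in memory as follows. This is only an illustration under simplifying assumptions: the `Ledger` class and its methods are hypothetical, and the working copy stands in for an open database transaction that is committed or rolled back as a whole.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory ledger used to illustrate all-or-nothing processing.
final class Ledger {
    private Map<String, Long> balances = new HashMap<>();

    void open(String account, long amount) {
        balances.put(account, amount);
    }

    long balanceOf(String account) {
        return balances.get(account);
    }

    // Applies all bookings of the batch or none of them.
    void applyBatch(Map<String, Long> bookings) {
        // All changes go to a working copy first.
        Map<String, Long> working = new HashMap<>(balances);
        for (Map.Entry<String, Long> booking : bookings.entrySet()) {
            Long current = working.get(booking.getKey());
            if (current == null) {
                // Leaving here discards the working copy: the "rollback".
                throw new IllegalArgumentException("unknown account " + booking.getKey());
            }
            working.put(booking.getKey(), current + booking.getValue());
        }
        balances = working; // the "commit": all bookings become visible at once
    }
}
```

Because a failed batch leaves the balances untouched, repeating the whole batch after the cause of the error has been fixed cannot apply any booking twice.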

There are different options for implementing a program with an "all-or-nothing" strategy, e.g. "single transaction per batch" or "one transaction per slice", which we will not explain in detail at this point.

Error Recovery

Unfortunately, job breakdowns and subsequent restarts cannot be completely eliminated in practice. Depending on the batch program, recovery can be more or less complex. If the program was implemented idempotently, recovery can happen automatically, simply by restarting the program with the same parameters.
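
A restart with the same parameters could look roughly like the following sketch, in which a store kept by the batch system records how far a job has come. `CheckpointStore`, `RestartableJob` and the `jobKey` parameter are hypothetical names for this example and do not describe an actual bmap4j API. In a real system the checkpoint would be written in the same transaction as the work it describes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntConsumer;

// Hypothetical stand-in for the processing log kept by the batch system.
final class CheckpointStore {
    private final Map<String, Integer> lastDone = new HashMap<>();

    // First item index that still has to be processed for this job.
    int resumeFrom(String jobKey) {
        return lastDone.getOrDefault(jobKey, -1) + 1;
    }

    void markDone(String jobKey, int index) {
        lastDone.put(jobKey, index);
    }
}

final class RestartableJob {
    private final CheckpointStore store;

    RestartableJob(CheckpointStore store) {
        this.store = store;
    }

    // Processes items 0 .. itemCount-1, starting at the stored checkpoint.
    // Calling run again with the same jobKey resumes after the last item
    // that completed successfully.
    void run(String jobKey, int itemCount, IntConsumer work) {
        for (int i = store.resumeFrom(jobKey); i < itemCount; i++) {
            work.accept(i);            // functional code, may throw
            store.markDone(jobKey, i); // assumption: same transaction as the work
        }
    }
}
```

The operator or scheduler does not need to know where the job stopped: a second call such as `job.run("invoices-2024-01", 6, work)` with unchanged arguments continues at the first item that was not completed.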

If an automatic restart is not possible, the job has to be restarted manually. To do so, the processing status is first determined from the job log. Any artifacts left behind by the aborted job, e.g. message queues, print output or temporary DB tables, are then deleted or reset. Afterwards, a new job is scheduled with program parameters chosen so that processing restarts at the correct position.
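
Determining the restart position from the job log could be supported by a small tool along the following lines. The log line format `COMMITTED slice=N` and the `RestartPosition` class are assumptions made up for this sketch. It further assumes that slices are numbered from 0 and committed in ascending order without gaps.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RestartPosition {
    // Hypothetical log line format, e.g. "COMMITTED slice=7".
    private static final Pattern COMMITTED = Pattern.compile("COMMITTED slice=(\\d+)");

    // Returns the first slice that still has to be processed: one past the
    // highest committed slice found in the log, or 0 if none was committed.
    static int nextSlice(List<String> logLines) {
        int highest = -1;
        for (String line : logLines) {
            Matcher matcher = COMMITTED.matcher(line);
            if (matcher.find()) {
                highest = Math.max(highest, Integer.parseInt(matcher.group(1)));
            }
        }
        return highest + 1;
    }
}
```

The returned value would then be passed to the new job as its start parameter, after the artifacts of the aborted job have been cleaned up.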

Manual restart should be avoided as a strategy for a BTP platform whenever possible, as it is usually rather complex. If it cannot be avoided, operations should at least be provided with the appropriate tools to reset the job. In case of errors, operations can also be supported by having the enterprise scheduler automatically start clean-up jobs that take care of part of the "tidy-up" work.