CICS uses an ESTAE or ESTAEX to establish its general-purpose recovery environment. One performance implication of this design is that on every program check, before the CICS ESTAE exit routine is driven, RTM2 takes a systrace snapshot (it copies and saves the systrace buffers of all processors) in anticipation of a dump being requested within the ESTAE exit. There is one systrace buffer per processor, so depending on how many processors there are and how big the systrace buffers are, the snapshot can use a considerable amount of CPU and wall-clock time.
The CICS ESTAE exit does not decide whether to take a dump. It first does a retry, and other routines later decide whether to take a dump. At the point of the retry, RTM2 throws the systrace snapshot away. If CICS decides to take a dump, a second systrace snapshot is done during dump processing. If CICS decides not to take a dump, one systrace snapshot has already been taken and gone unused.
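The snapshot accounting described above can be sketched as a small model. This is only an illustration of the counting argument, not of any CICS or z/OS interface: the function names are hypothetical, and the "deferred" variant simply assumes the behaviour this request asks for (no snapshot until a dump is taken).

```python
def snapshots_current(program_checks: int, dumps_taken: int) -> int:
    """Snapshots under the behaviour described in this request.

    One snapshot is taken before the ESTAE exit for every program check
    and discarded at retry; each dump then takes its own snapshot.
    """
    return program_checks + dumps_taken


def snapshots_deferred(program_checks: int, dumps_taken: int) -> int:
    """Snapshots if snapshotting were deferred until a dump is taken.

    Assumption: this is the requested behaviour, not an existing one.
    """
    return dumps_taken


def unused_snapshots(program_checks: int, dumps_taken: int) -> int:
    """Snapshots that are taken and then thrown away without being used."""
    return snapshots_current(program_checks, dumps_taken) - snapshots_deferred(
        program_checks, dumps_taken
    )
```

In this model every program check contributes exactly one unused snapshot, whether or not a dump follows.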
Sometimes there can be multiple program checks in a CICS region that happen one after the other and do not result in a dump. The resulting string of systrace snapshots can cause considerable performance impacts in terms of CPU charged to the running CICS task, the TCB not being available to other waiting CICS tasks, and other jobs in the LPAR being slowed down by contention with the systrace snapshot processing.
One example of a string of program checks where no dump is taken is when Language Environment (LE) is writing out abend information for a transaction abend. Most settings of the LE runtime option TERMTHDACT cause LE to attempt to write out the storage pointed to by each abend register as well as by each register saved at each stack level. Often, many of those registers and saved registers do not point to addressable storage, so this processing sometimes results in tens or even hundreds of program checks per transaction abend. These program checks are normal, are not externalized, and are not a problem in themselves. But because a systrace snapshot is done for each one, the performance implications are significant.
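A rough sense of scale can be had by multiplying the factors named in this request. The sketch below assumes only that each snapshot copies one buffer per processor, as stated above. The function name and the figures in the usage note are hypothetical and are not measurements from any system.

```python
def bytes_copied_per_abend(
    program_checks: int, processors: int, buffer_bytes: int
) -> int:
    """Estimate systrace data copied for one transaction abend.

    Each program check triggers one snapshot, and each snapshot copies
    the systrace buffer of every processor (one buffer per processor).
    All inputs are illustrative values supplied by the caller.
    """
    if min(program_checks, processors, buffer_bytes) < 0:
        raise ValueError("inputs must be non-negative")
    return program_checks * processors * buffer_bytes
```

For example, with purely illustrative inputs of 100 program checks, 8 processors, and 1 MiB buffers, `bytes_copied_per_abend(100, 8, 1024 * 1024)` gives 838860800 bytes copied for a single transaction abend, none of which is used if no dump is taken.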
Another example of a string of program checks where no dump is taken is when DFHAP0001 and DFHSR0001 messages are set to suppress dumps. A test region could be stuck in a loop of program checks resulting in DFHAP0001 messages and no dumps. Even though no dumps are being taken, each program check results in a systrace snapshot. Because of that, a string of program checks in a test region could impact processing in other jobs and other CICS regions in the LPAR, especially if those other jobs are trying to take a system dump.
ESPIE and SETFRR, by contrast, establish recovery environments in which no systrace snapshot is done before the recovery exit is driven.
In order to reduce the significant overhead of program check processing, CICS should utilize a recovery design for program checks that avoids systrace snapshot processing until a dump is taken.
Due to processing by IBM, this request was reassigned to have the following updated attributes:
Brand - Servers and Systems Software
Product family - Transaction Processing
Product - CICS Transaction Server
For record keeping, the previous attributes were:
Brand - WebSphere
Product family - Transaction Processing
Product - CICS Transaction Server
This is not a trivial change in an area of CICS that needs to be as error free as possible. SETFRR needs to be authorised, so it is probably not an option. ESPIE is possible (maybe). It could handle all program checks, but we would still need to be able to handle recovery from MVS abends etc. Looking at current plans, it is not likely that this would be implemented in the next two CICS TS releases, so this requirement is being rejected. You have an opportunity to resubmit in eighteen months' time if you wish it to be considered then.