As previously discussed, we're not able to fully exploit CICS/CPSM dynamic routing capabilities because of one fundamental flaw (as we see it): CICS/CPSM will continue to commit workload to AORs whilst all appears well, i.e. sessions are free; CICS regions have task slots; they are not SoS; they are not suffering abends or dumping; and response times are within acceptable limits.
Only when regions reach MAXT or go SoS do we see CPSM start to hold off.
Whilst this general philosophy is sound when all is working well, it may not be appropriate for CICS/CPSM to behave in this manner - i.e. keep committing workload to an application simply because it can.
At the end of the day, a sysplex is a complex amalgam of various applications that have been sized and have had capacity plans submitted, and the resulting machine is sized to cope with the capacity-planned peak workload.
If, because of issues around response times, we need to increase the number of target AORs simply to get the peak workload through within the SLA, then unless we monitor what workload has already been committed to the application (and note I keep using the word 'application' here), there's a danger that we will starve the other applications sharing that sysplex of resource (applications which have already bought and paid for a certain amount of capacity).
I am not worried about undersized deployments, i.e. where not enough capacity is laid down to support the peak workload for a given application - in effect the business is paying over the odds for what it has been given.
But what I am worried about is these topologies where we have to throw CICS regions at an app simply because its internal response times can be so slow.
I therefore want to be able to cap the committed workload sent to an application's AORs.
What do I mean by an application? An application is a collection of services. A service is a collection of operations. An operation is a unitary element in that hierarchy - it should represent an executable (txn and/or program). Services and Operations can be versioned, BTW.
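To make the hierarchy concrete, here is a minimal sketch of how it could be modelled. This is only an illustration of the definitions above, not anything CPSM provides today; the class names, the `owning_application` helper and the example data (`Accounts`, `Enquire`, `ACC1`, `ACCPGM`) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Operation:
    """Unitary element: represents an executable (txn and/or program)."""
    name: str
    version: str
    transaction: Optional[str] = None  # CICS transaction id, if any
    program: Optional[str] = None      # CICS program name, if any


@dataclass
class Service:
    """A versioned collection of operations."""
    name: str
    version: str
    operations: List[Operation] = field(default_factory=list)


@dataclass
class Application:
    """A collection of services."""
    name: str
    services: List[Service] = field(default_factory=list)


def owning_application(apps, service, operation, version):
    """Return the name of the application owning service.operation at the
    given operation version, or None if no application owns it."""
    for app in apps:
        for svc in app.services:
            if svc.name != service:
                continue
            for op in svc.operations:
                if op.name == operation and op.version == version:
                    return app.name
    return None


# Hypothetical example data.
appx = Application("APPX", [
    Service("Accounts", "1.0", [
        Operation("Enquire", "1.0", transaction="ACC1", program="ACCPGM"),
    ]),
])

print(owning_application([appx], "Accounts", "Enquire", "1.0"))  # prints APPX
```

The point of the lookup is that a versioned service.operation can always be traced back to exactly one application, which is the level at which I want the cap applied.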
I then need topology information related to that application. Typically this will be a composite Group of groups. For example, I may have APPX_CICS A & B on LPAR 1; APPX_CICS C & D on LPAR 2. I'd like to group CICS A & B into a Group - call it APPX_LPAR1_Group and C&D into APPX_LPAR2_Group. I would then like to create a group called APPX_Group - comprised of the two LPAR groups.
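The Group of groups described above could be sketched like this. It is an assumption-laden illustration rather than a description of CPSM system group definitions: the `Region` and `Group` classes are hypothetical, and the region names `APPX_CICS_A` to `APPX_CICS_D` are just my spelling of the four regions in the example.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Region:
    """A single CICS region (AOR) and the LPAR it runs on."""
    name: str
    lpar: str


@dataclass
class Group:
    """A group whose members are regions or other groups."""
    name: str
    members: List[Union["Group", Region]] = field(default_factory=list)

    def regions(self) -> List[Region]:
        """Flatten nested groups into the list of regions they contain."""
        found: List[Region] = []
        for member in self.members:
            if isinstance(member, Group):
                found.extend(member.regions())
            else:
                found.append(member)
        return found


lpar1_group = Group("APPX_LPAR1_Group", [
    Region("APPX_CICS_A", "LPAR1"),
    Region("APPX_CICS_B", "LPAR1"),
])
lpar2_group = Group("APPX_LPAR2_Group", [
    Region("APPX_CICS_C", "LPAR2"),
    Region("APPX_CICS_D", "LPAR2"),
])
appx_group = Group("APPX_Group", [lpar1_group, lpar2_group])

print([r.name for r in appx_group.regions()])
```

Targeting `APPX_Group` yields all four regions, while targeting either LPAR group yields only the two regions on that LPAR.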
I can then take a versioned service.operation, work out its owning application and equate that to a routing target:
- the Group of Groups?
- the LPAR Group?
- the individual CICS?
In 24x7 deployments, we will always target the Group of Groups and let DTR decide which LPAR to route to (we'd prefer to avoid XCF if possible) and within each LPAR, which CICS to route to.
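A rough sketch of the behaviour I am asking for, combining the application-level cap with the preference for staying on the local LPAR. This is not how CPSM routes today. The `choose_region` function is hypothetical, measuring "committed workload" as the sum of active tasks across the application's AORs is my assumption, and picking the least busy region is a stand-in for whatever routing algorithm is actually in use.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Region:
    """A target AOR with its current task count (assumed measure of load)."""
    name: str
    lpar: str
    active_tasks: int = 0


def choose_region(regions: List[Region], router_lpar: str,
                  app_cap: int) -> Optional[Region]:
    """Pick a target region for one application, or None if the cap is hit."""
    # Workload already committed to the application as a whole.
    committed = sum(r.active_tasks for r in regions)
    if committed >= app_cap:
        return None  # hold off: the application has used its allocation

    # Prefer regions on the router's own LPAR to avoid XCF.
    local = [r for r in regions if r.lpar == router_lpar]
    candidates = local or regions

    # Stand-in for the real routing decision: least busy candidate.
    return min(candidates, key=lambda r: r.active_tasks)


appx_regions = [
    Region("APPX_CICS_A", "LPAR1", active_tasks=3),
    Region("APPX_CICS_B", "LPAR1", active_tasks=1),
    Region("APPX_CICS_C", "LPAR2", active_tasks=0),
    Region("APPX_CICS_D", "LPAR2", active_tasks=0),
]

target = choose_region(appx_regions, router_lpar="LPAR1", app_cap=10)
print(target.name if target else "cap reached")
```

With a cap of 10 the request goes to the less busy LPAR1 region even though LPAR2 is idle. With a cap of 4 the same call returns `None`, because the application already has 4 tasks committed, regardless of how many free task slots the individual regions still have.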
By the way, I can have two or more operations (Application.Service.Operation.A and Application.Service.Operation.B) both pointing at the same txn and program. In SOA terms we don't see a versioned Service.Operation having to have its own discrete CICS txn or program. It may well do but we are not prescribing that.
See the use case below. I think that will illustrate the point I am getting at.
This problem usually manifests itself with applications that need lots of CICS because internal response times are lousy.
Take an example:
Due to processing by IBM, this request was reassigned to have the following updated attributes:
Brand - Servers and Systems Software
Product family - Transaction Processing
Product - CICS Transaction Server
For record keeping, the previous attributes were:
Brand - WebSphere
Product family - Transaction Processing
Product - CICS Transaction Server
We have no plans to address this. Currently, WLM has no embedded function to restrict workload throughput. To implement this would not be a simple task.