Status: Delivered
Categories: z/TPF
Created by: Guest
Created on: Dec 17, 2019

Add DASD device level metrics to Continuous Data Collection (CDC)

Current state:
CDC exports very limited metrics related to the DASD subsystem.
- I/O service times are summarized (averaged) at the SSID level, which includes 32 devices on our systems but may include as many as 64.
  o Because of this we cannot see the performance of any single disk MOD.
  o Poor performance of a single MOD can be masked when its service time is averaged with that of many other devices (a small illustration follows this list).
- DASD queue metrics are sampled, with counts reported for only the top 9 devices in any interval.
  o On our system that means over 99% of the MODs have unreported values.
  o The sparse data for any single MOD makes it impossible to build reliable predictive metrics related to queuing.
- I/O counts are summarized at the DASD device type level (DEVA, DEVB, …), so we have no visibility into how much I/O any particular MOD is handling.
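
To make the masking effect concrete, here is a small worked illustration. The numbers are hypothetical, not measurements from our systems: one MOD with a badly elevated service time barely moves the SSID-level average across 32 devices, so the degraded device stays invisible.

    # Hypothetical numbers only: how an SSID-level average hides one degraded MOD.
    service_ms = [0.5] * 31 + [8.0]   # 31 healthy MODs plus one degraded MOD

    ssid_average = sum(service_ms) / len(service_ms)
    print(f"SSID-level average service time: {ssid_average:.2f} ms")  # ~0.73 ms, looks normal
    print(f"Worst single MOD service time:   {max(service_ms):.2f} ms")  # 8.00 ms, hidden by the average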

Because the data-gathering methods above are not consistent, we do not have I/O, queue, and service-time data that can be correlated directly within a given sampling interval. We also do not have enough data to accurately analyze some apparent problems in the DASD subsystem, so we resort to 'homegrown' solutions to gather the data and perform tedious manual analysis.

Since we run loosely coupled systems, DASD subsystem performance must be predictable and reliable. We have seen DASD performance issues cause contention and be the root cause of system outages.

Desired state:
Capture and export the data needed to calculate service time, I/O rate, and disk queue status for every DASD device.
- This will allow us to do the following:
  o Build predictive models that describe the expected behavior of the DASD subsystem at all times of day.
  o Identify the source of performance abnormalities, down to a specific device, in near real time.
  o Monitor system performance in near real time to ensure actual performance is tracking with predicted/expected performance.
    - This in turn enables us to take preemptive measures if we see a trend toward unexpected performance, or to automate reactions to sudden performance issues.
The actual calculations do not need to be done in TPF, but the exported data should be complete enough to allow an offline process to perform the calculations quickly (a sketch of such a process follows below).
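
As an illustration of the kind of offline processing we have in mind, the sketch below derives per-device I/O rate, average service time, and average queue depth from a single per-interval record. The record layout and field names (mod, interval_secs, io_count, total_service_ms, queue_samples) are assumptions made for the example; they are not the actual CDC export format.

    # A minimal offline sketch; the record layout below is an assumption, not the CDC format.
    from dataclasses import dataclass

    @dataclass
    class DeviceInterval:
        mod: str                  # device (MOD) identifier
        interval_secs: float      # length of the collection interval
        io_count: int             # I/Os completed on this device in the interval
        total_service_ms: float   # accumulated service time for those I/Os
        queue_samples: list       # queue-depth samples taken during the interval

    def summarize(rec: DeviceInterval) -> dict:
        # All three metrics come from the same device and the same interval,
        # so they can be correlated directly.
        io_rate = rec.io_count / rec.interval_secs
        avg_service = rec.total_service_ms / rec.io_count if rec.io_count else 0.0
        avg_queue = sum(rec.queue_samples) / len(rec.queue_samples) if rec.queue_samples else 0.0
        return {"mod": rec.mod, "io_per_sec": io_rate,
                "avg_service_ms": avg_service, "avg_queue_depth": avg_queue}

    # Example with made-up values for one device and one 15-second interval.
    print(summarize(DeviceInterval("MOD0041", 15.0, 1200, 960.0, [0, 1, 0, 2, 1])))

With a summary like this available for every device in every interval, building baselines and flagging devices that deviate from them becomes straightforward offline work.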

Idea priority: Medium
  • Guest, Jun 17, 2020

    This is available with z/TPF APAR PJ46077.