CICS Monitoring
zOSMON as a feature of zVPS provides real time information, daily reporting, and long term trend capabilities. This page is to provide direction on how to use zOSMON to anlayze CICS transaction response times.
Response time reporting is done at several levels. One measure is total transactions for the CICS region. Another method is reporting by transaction names. For response time, we evaluate the terminated transactions. There is another class of transactions, the long term, or "inflight". Though these do not have a response time, their performance may be critical, and the resources consumed by inflight transactions may impact others or be impacted by others.
Because the overhead of the zVPS analysis is so extremely low, it is safe to monitor performance continously. The concept only monitoring when there is a problem implies diagnostics, where as if one always captures the data, any problem can be diagnosed from existing data rather than having to start a diagnostics tool and hope the problem re-occurs. zVPS focuses on management and ensuring all data is available for performance diagnostics, capacity planning and other parts of performance management such as operational alerts.
zVPS includes a Full Performance Database meaning that all data is captured for long term or post process analysis. The PDB is accessable real time through zVIEW and zMON, or as a batch process using zMAP.
zOSMON provides several CICS reports breaking down the different components of reponse time. Starting at a high level, the following is included on zoscix2/3/4. Response time has two major components:
Dispatch time includes CPU time and for each transaction type CPU time should be close to static for a transaction ID. Dispatch time also includes time to load the program, "PC Load" time. If programs are pre-loaded, this should be zero.
High level gross response time breakout is found on ESACIX2. Response time is broken up into dispatch time and suspend time. Dispatch time includes CPU time, and ZIP time. Suspend time has many components. In the following case, showing total terminated transactions for 3 different regions, V62C show much higher response time and that shows up in both suspend time and dispatch time. Plotting response time over long periods will be important to know what is normal and what is acceptable.
Report: ZOSCIX2 z/OS Region Transaction Analysis Monitor initialized: 10/20/20 at 17:56:00 on Z15S serial 0E0F78 ---------------------------------------------------------------- Time/<-Response Time-> SYSID/ APPLID Group Count Total Susp Disp CPU DISP ZIP Time Resp Time Time Time Wait CPU ---- ------- -------- ----- ----- ----- ----- ----- ----- --- 10/20/20 17:56:00 - 17:57:00 V24A C24ASTND Totals 688 0.016 0.006 0.010 0.007 0.001 0 V24B C24BSTND Totals 992 0.025 0.014 0.011 0.008 0.001 0 V62C CICSJA69 Totals 440 0.618 0.510 0.108 0.021 0 0
Suspend time includes dispatch delay as well as other components. At a high level, dispatch delay includes time for first dispatch, I/O wait and a large "other" bucket. There are always unaccounted time as well.
Report: ZOSCIX2 z/OS Regi Velocity Software Corporate ZMAP 5.1. Monitor initialized: 10/20/20 ------------------------------------------------------------------------- Time/<-Total-> <------Dispatch Delays--------> SYSID/ APPLID Group Count <--CPU--> <-------milliseconds----------> Time Secs Pct 1StDisp I/O Wait Other Unknown ---- ------- -------- ----- ---- ---- ------- -------- ----- ------- 10/20/20 17:56:00 - 17:57:00 V24A C24ASTND Totals 688 4.86 8.11 0.005 0.001 0.000 0 V24B C24BSTND Totals 992 8.27 . 0.012 0.000 0.001 0 V62C CICSJA69 Totals 440 9.26 . 0.510 0 0 0
Suspend time is the total time the transaction was suspended and includes
Part of performance management is operational alerts. Using zALERT, you will want to implement alerts on items such as queue levels, active regions, active terminals, and the number and type of conversational transactions.
With low overhead of processing the transaction and statistics
records, zVPS normally collects and reports data on
a one minute intervals that is aligned
to the minute. During statistics gathering, there is overhead
associated with that inside CICS. Traditionally, this is suspected
to create a performance impact during this very short time frame
of statistics gathering.
If the statistics interval
is 60 seconds, this will impact every 60 seconds, if the interval
is 15 minutes, then this will impact every 900 seconds. One of
the experiments we performed was to show response time second by second.
To run your own benchmark on all CICS in an LPAR, issue the
command
"SMSG ZSERVE SET BENCHMARK sysid jobname"
That will produce one
record every second for 5 minutes showing response time of all
transactions completing in that second. This allows
analysis of the impact of the statistics gathering at the top of the
minute. It also allows quick determination of dyanmic tuning
options or if other workload changes are effective or detrimental
In benchmark mode, zwrite will create a "BENCHMARK FILE" on the ZWRITE 191 disk. The columns are transaction count, Response time, (usrdisp+susptime), usrdispt, susptime, usrcput.
Bench: 17:27:17 VSI1 16 0.008 0.008 0.005 0.003 0.004 Bench: 17:27:18 VSI1 40 0.006 0.006 0.006 0.001 0.005 Bench: 17:27:19 VSI1 87 0.017 0.017 0.006 0.011 0.006 Bench: 17:27:20 VSI1 120 0.036 0.036 0.004 0.032 0.004 Bench: 17:27:21 VSI1 95 0.002 0.002 0.002 0.000 0.002 Bench: 17:27:22 VSI1 87 0.004 0.004 0.003 0.000 0.003 Bench: 17:27:23 VSI1 96 0.004 0.004 0.004 0.001 0.003 Bench: 17:27:24 VSI1 112 0.006 0.006 0.004 0.002 0.004 Bench: 17:27:25 VSI1 64 0.005 0.005 0.004 0.001 0.004
This capability allows immediate analysis of any dynamic tuning performed or to understand impact of other actions.
When there is performance issues, one common question is "what changed". One of the possible changes is transaction loads. If one transaction suddenly ramps up unexpectatdly or in error, this will change performance characteristics. To understand the quantity of transactions by transaction, look at the ZOSCIX3 to see the load by transaction name.
Report: ZOSCIX3 z/OS CICS User Transacti Monitor initialized: 10/20/20 at 17:56:00 on --------------------------------------------- Time/ <--Transaction->SYSID Rate Total Susp Disp APPLID ID Count /Sec Resp Time Time -------- ---- ----- ---- ----- ----- ----- 10/20/20 17:56:00 - 17:57:00 V24A C24ASTND Total 688 11.5 0.016 0.006 0.010 CSAC 55 0.92 0.003 0.001 0.002 STRH 31 0.52 0.059 0.018 0.041 STR1 29 0.48 0.007 0.003 0.004 STR2 28 0.47 0.007 0.000 0.007 STR3 22 0.37 0.013 0.001 0.012 STR4 23 0.38 0.014 0.001 0.014 STR5 21 0.35 0.028 0.005 0.023 STR6 24 0.40 0.021 0.002 0.019 STR7 22 0.37 0.025 0.005 0.020 STR8 25 0.42 0.031 0.007 0.023 STR9 35 0.58 0.047 0.017 0.030 ZIPC 113 1.88 0.006 0.004 0.001 ZIPU 118 1.97 0.004 0.002 0.002 ZPHG 71 1.18 0.014 0.007 0.007 ZPHP 71 1.18 0.022 0.016 0.006
CPU analysis has several different aspects. One can start by understanding the LPAR requirements by looking at the ZOSCPU report. In this case there are two general purpose engines and one zIIP that is not being used.
Report: ZOSCPU Z/OS CPU Report Monitor initialized: 10/20/20 at 17:56:00 on -------------------------------------------- TIME/ <--CPU--> Sample <-CPU Utilization> SYSID ID Type Count Total Wait Parked ------ ---- ---- ----- ----- ----- ------ 17:57:00 - 17:58:00 V24A Tot GP 2 22.6 142.2 0 0 GP 1 12.9 68.0 0 1 GP 1 9.7 74.3 0 2 zIIP 1 0.1 100.0 0
After looking at the high level CPU, one can look at the specific CICS region on the ZOSJCPU report.
Report: ZOSJCPU z/OS Job/Step CPU/Resource Report Monitor initialized: 10/20/20 at 17:56:00 on Z15S serial 0E0F78 ------------------------------------------------------------------ SYSID <------JOB----------> <-------CPU Percents----------------> Name JobID Step Initiat Regn Nbr Total STD SRB TCB SRB I/O Cntrl USS ---- -------- -------- --- ----- ---- ---- --- --- --- ----- --- 10/20/20 17:56:00 - 17:57:00 V24A Totals . 19.5 16.0 3.2 0 0 0.3 0 0 CATALOG CATALOG 1 0 0 0 0 0 0 0 0 C24ASTND STC04414 1 13.0 12.4 0.6 0 0 0 0 0 GPMSERVE STC03424 1 0.1 0.1 0 0 0 0 0 0 IXGLOGR IXGLOGR 1 0.0 0 0.0 0 0 0 0 0 IZUSVR1 STC03413 3 0.8 0.8 0.0 0 0 0 0 0 JESXCF JESXCF 1 0.0 0.0 0 0 0 0 0 0 JES2 JES2 1 0.3 0.3 0.0 0 0 0 0 0 JES2MON JES2MON 1 0.3 0.1 0.1 0 0 0 0 0 MQA1CHIN STC03890 1 0.0 0.0 0.0 0 0 0 0 0 MQA1MSTR STC03453 1 0.3 0.1 0.1 0 0 0.0 0 0 MQA2CHIN STC03937 1 0.0 0.0 0.0 0 0 0 0 0 MQA2MSTR STC03456 1 0.0 0.0 0.0 0 0 0 0 0 MSTJCL00 MSTR 1 0.1 0.1 0.0 0 0 0 0 0 OMVS OMVS 1 0.0 0.0 0 0 0 0 0 0 RMF STC03391 1 0.1 0.1 0.0 0 0 0 0 0 SDSF STC03387 1 0.0 0.0 0.0 0 0 0 0 0 SDSFAUX STC03407 1 0.4 0.4 0.1 0 0 0 0 0 SMF SMF 1 0.0 0.0 0 0 0 0 0 0 SMS SMS 1 0.0 0.0 0.0 0 0 0 0 0 TCPIP STC03385 1 0.9 0.2 0.7 0 0 0 0 0 TN3270 STC03421 1 1.4 0.1 1.3 0 0 0 0 0 WLM WLM 1 0.3 0.2 0.0 0 0 0 0 0 XCFAS XCFAS 1 0.1 0.1 0.0 0 0 0 0 0 ZFS STC03381 1 0.2 0.2 0.0 0 0 0 0 0 ZOSMNVM2 STC03419 1 0.3 0.3 0 0 0 0 0 0 ZOSMNVM4 STC03420 1 0.3 0.3 0.0 0 0 0 0 0
And we can see at this point, the CICS region is using 13% of one GP processor.
To take that analsys inside CICS, the totals for all the transactions that do not include CICS overhead shows as 4.9 CPU seconds during the interval, or 8.1% of one GP CPU.
Report: ZOSCIX2 z/OS Region Transaction Analysis Veloc Monitor initialized: 10/20/20 at 17:56:00 on Z15S serial 0E0F78 ------------------------------------------------------------------------ Time/<-Response Time-> <-Total- SYSID/ APPLID Group Count Total Susp Disp CPU DISP ZIP <--CPU-- Time Resp Time Time Time Wait CPU Secs Pct ---- ------- -------- ----- ----- ----- ----- ----- ----- --- ---- --- 10/20/20 17:56:00 - 17:57:00 V24A C24ASTND Totals 688 0.016 0.006 0.010 0.007 0.001 0 4.9 8.1 ------------------------------------------------------------------------ 17:57:00 - 17:58:00 V24A C24ASTND Totals 1192 0.013 0.005 0.008 0.006 0.001 0 6.6 11. InFlight ------------------------------------------------------------------------ 17:58:00 - 17:59:00 V24A C24ASTND Totals 1189 0.013 0.005 0.008 0.006 0.001 0 6.8 11. InFlight 11 119.9 119.7 0.221 0.064 0.024 0 0.7 1.1
CPU by transaction is found on ZOSCIX4, and in this case it is obvious that this workload does not utilize the zIIP processor. The CPU time is per transaction in CPU seconds. In this case the average for ALL the transactions that completed was .007 CPU seconds or 7 milliseconds per transaction.
Report: ZOSCIX4 z/OS CICS Tranaction IO Wait Analy Monitor initialized: 10/20/20 at 17:56:00 on Z15S seria ------------------------------------------------------- Time/ Transaction <-Response Time->SYSID Total Susp Disp CPU PC ZIP APPLID ID Count Resp Time Time Time Load CPU -------- ---- ----- ----- ----- ----- ----- ----- --- 10/20/20 17:56:00 - 17:57:00 V24A C24ASTND Total 688 0.016 0.006 0.010 0.007 0 0 CSAC 55 0.003 0.001 0.002 0.001 0 0 STRH 31 0.059 0.018 0.041 0.037 0 0 STR1 29 0.007 0.003 0.004 0.003 0 0 STR2 28 0.007 0.000 0.007 0.005 0 0 STR3 22 0.013 0.001 0.012 0.010 0 0 STR4 23 0.014 0.001 0.014 0.011 0 0 STR5 21 0.028 0.005 0.023 0.017 0 0 STR6 24 0.021 0.002 0.019 0.017 0 0 STR7 22 0.025 0.005 0.020 0.017 0 0 STR8 25 0.031 0.007 0.023 0.021 0 0 STR9 35 0.047 0.017 0.030 0.026 0 0 ZIPC 113 0.006 0.004 0.001 0.001 0 0 ZIPU 118 0.004 0.002 0.002 0.001 0 0 ZPHG 71 0.014 0.007 0.007 0.002 0 0 ZPHP 71 0.022 0.016 0.006 0.002 0 0
Storage information is provided by the statistics type 29 record at the interval chosen. (type 2 for VSE). This data is exposed in great detail on the ZOSCIXS1 Dynamic Storage Area Report as shown in the following partial report:
Report: ZOSCIXS1 z/OS CICS Dynamic Storage Area Report Velocity Monitor initialized: 10/20/20 at 17:56:00 on Z15S serial 0E0F78 ---------------------------------------------------------------------------- SYSID JobName Pool <-----DSA (mb)----> Pool <-Storage in Mbytes--> Lrgst Locn Limit Total HWM Name Size HWM Cush Free Free ---- --------- ---- ----- ------ ----- ---- ----- ----- ----- ---- ---- - 10/20/20 17:56:00 - 17:57:00 V24A C24ASTND <16M 5.000 1.000 1.000 CDSA 0.3 0.3 0.1 0.1 70K UDSA 0.3 0.3 0.1 0.3 262K SDSA 0.3 0.3 0.1 0.2 258K RDSA 0.3 0.3 0.1 0.1 78K <2GB 400.0 63.00 63.0 ECDSA 10.0 10.0 0.1 2.5 975K 1 EUDSA 2.0 2.0 0 2.0 1.0M ESDSA 17.0 17.0 0.1 1.0 1.0M ERDSA 33.0 33.0 0.3 0.3 217K ETDSA 1.0 1.0 0.1 1.0 1.0M >2GB 10240 1024 1024 GCDSA 0.0 0.0 0.0 0.0 1003
Items that you may wish to set alerts on is the "short on storage", "sos", and possbly the high water mark (HWM) as well.
You may wish to compare the CICS view of storage with the z/OS "job view" on the ZOSJSTR report.
Report: ZOSJSTR z/OS Job Storage Report Monitor initialized: 10/20/20 at 17:56:00 on Z15S serial 0E0F78 ------------------------------------------------------------------- SYSID <------JOB----------> CPU <-LSQA-->16M <16M >16M Regn Nbr Pct <16M >16M ---- -------- -------- --- --- ---- ---- ---- ---- ----- --- ---- 10/20/20 17:56:00 - 17:57:00 V24A Totals . 19.5 10M 2.2G 16M 1.8G 309M 74G 4.3M CATALOG CATALOG 1 0 561K 16M 16K 1.1M 8.4M 2.0G 8168 C24ASTND STC04414 1 13.0 393K 17M 5.6M 439M 8.4M 2.0G 1.2M GPMSERVE STC03424 1 0.1 197K 15M 127K 21M 8.4M 2.0G 262K IXGLOGR IXGLOGR 1 0.0 213K 16M 57K 9.0M 8.4M 2.0G 8168 IZUSVR1 STC03413 3 0.8 332K 132M 197K 233M 8.4M 2.0G 8168 JESXCF JESXCF 1 0.0 164K 13M 53K 1.1M 8.4M 2.0G 8168 JES2 JES2 1 0.3 311K 14M 631K 867M 8.4M 2.0G 8168 JES2MON JES2MON 1 0.3 143K 12M 4096 2.5M 0 0 0 ...
Storage at a system level can be evaluated on a finer granularity by looking at transaction type requirements. There are different transaction uses of storage, below the line, above the line, UDSA, CDSA, and combination. It is important to understand that a transaction only uses storage for a short time. There are two parts of that; there is the storage requirements of the transaction, but then the transaction holds that storage for approcimately the length of the transaction. Thus whe response time gets worse, the "storage impact" can increase greatly.
This is the concept of storage residency or storage impact based on transactions with finer granularity. The following ZOSCIX5 report shows the average storage per transaction, and then shows the storage impact. Using "Storage impact" by transaction ID, it becomes easy to determine where to spend time reducing storage.
Report: ZOSCIX5 z/OS CICS Tranaction Storage Analysis ZMAP 5.1.1 ----------------------------------------------------------------------------------------- Time/ Transaction<-Shared-> SYSID <-GT 16MB-> <-GT 16MB--> <---per----> APPLID ID Count USDA CDSA EUDSA ECDS UDSA CDSA EUDSA ECDSA Gets Frees Xact TranID -------- ---- ----- ---- ---- ----- ----- ---- ---- ----- ----- ---- ----- ----- ------ 17:56:00 - 17:57:00 V24A C24ASTND Totals 688 0 630K 0 62618 0 0 2.8 0.8 0 0 180 123928 CSAC 55 0 0 0 4015 0 0 0 1.0 0 0 0 11 STRH 31 0 2.5M 0 0 0 0 4.0 0 0 0 2469 76549 STR1 29 0 227K 0 0 0 0 4.0 0 0 0 27 771 STR2 28 0 412K 0 0 0 0 4.0 0 0 0 50 1413 STR3 22 0 676K 0 0 0 0 4.0 0 0 0 142 3121 STR4 23 0 805K 0 0 0 0 4.0 0 0 0 192 4421 STR5 21 0 1.4M 0 0 0 0 4.0 0 0 0 654 13742 STR6 24 0 1.1M 0 0 0 0 4.0 0 0 0 405 9712 STR7 22 0 1.2M 0 0 0 0 4.0 0 0 0 487 10724 STR8 25 0 1.4M 0 0 0 0 4.0 0 0 0 723 18070 STR9 35 0 1.8M 0 0 0 0 4.0 0 0 0 1416 49558
By picking the major storage users based on the last column, improving response time and/or reducing storage of the transaction IDs would be where to start.
If the CICS region is paged out, performance will degrade. System paging can be found on the ZOSPAGE report. In this case there is zero paging I/O at the system level.
Report: ZOSPAGE Z/OS PAGE Detail Report Velocity Sof Monitor initialized: 10/20/20 at 17:56:00 on Z15S serial 0E0F78 ------------------------------------------------------------------------------- Time/ Page Volume Devc <-----Page Slots------------> Pct ASMSYSID Type Serial Nmbr Total MaxU MinU Avg Bad Used Used page I/O ------ ----- ------ ---- ----- ----- ----- ----- ----- ---- ---- ----- ----- 10/20/20 17:56:00 - 17:57:00 V24A Totals: 1025K 152K 152K 152K 0 14.9 0 0 0 PLPA 24ASPG 0246 234K 16366 16366 16366 0 6.99 0 0 0 Commn 24ACAT 0241 9719 2489 2489 2489 0 25.6 0 0 0 Local 24ASPG 0246 781K 134K 134K 134K 0 17.1 0 0 0
Paging at the specific region level can be determined on the ZOSJSTR report showing storage of each job as well as swap and paging. The following shows all page/swap rates associated region by retion:
Report: ZOSJSTR z/OS Job Storage Report ZMAP 5.1.1 11/01/20 Page ------------------------------------------------------------------------------- SYSID <------JOB----------> CPU> . Page <--Page In-> Hper SwapName JobID Step - . Stls CSA Hper LPA Page Rate Nbr Pctm . Spce Out In OUT In Out ---- -------- -------- --- ---- . ---- --- ---- --- ---- ---- --- --- --- --- 10/20/20 . 17:56:00 - 17:57:00 . V24A . Totals . 19.50 . 0 0 0 0 0 0 0 0 0 0 CATALOG CATALOG 1 00 . 0 0 0 0 0 0 0 0 0 0 C24ASTND STC04414 1 13.00 . 0 0 0 0 0 0 0 0 0 0 GPMSERVE STC03424 1 0.10 . 0 0 0 0 0 0 0 0 0 0 IXGLOGR IXGLOGR 1 0.00 . 0 0 0 0 0 0 0 0 0 0 IZUSVR1 STC03413 3 0.80 . 0 0 0 0 0 0 0 0 0 0 JESXCF JESXCF 1 0.00 . 0 0 0 0 0 0 0 0 0 0 JES2 JES2 1 0.30 . 0 0 0 0 0 0 0 0 0 0
At this point, if there were paging, then evaluation of storage requirements and how the storage is used would be the next step.
Report: ZOSCIX4 z/OS CICS Tranaction IO Wait Analysis Monitor initial10/20/20 at 17:56:0 ------------------------------------------------------------------- Time/ Trans <-Response Time-> .Suspend - I/O Wait (milliseconds) SYSID Total Susp Disp . Shr APPLID ID Resp Time Time . Trm Jrn TS File MRO TD TS RLS -------- ---- ----- ----- ----- . --- --- --- ---- --- --- --- --- 10/20/20 . 17:56:00 - 17:57:00 . V24A . C24ASTND Total 0.016 0.006 0.010 . 0.0 0 0 0.0 0 0 0 0 CSAC 0.003 0.001 0.002 . 0.0 0 0 0 0 0 0 0 STRH 0.059 0.018 0.041 . 0 0 0 0 0 0 0 0 STR1 0.007 0.003 0.004 . 0 0 0 0 0 0 0 0 STR2 0.007 0.000 0.007 . 0 0 0 0 0 0 0 0 STR3 0.013 0.001 0.012 . 0 0 0 0 0 0 0 0 STR4 0.014 0.001 0.014 . 0 0 0 0 0 0 0 0 STR5 0.028 0.005 0.023 . 0 0 0 0 0 0 0 0 STR6 0.021 0.002 0.019 . 0 0 0 0 0 0 0 0 STR7 0.025 0.005 0.020 . 0 0 0 0 0 0 0 0 STR8 0.031 0.007 0.023 . 0 0 0 0 0 0 0 0 STR9 0.047 0.017 0.030 . 0 0 0 0 0 0 0 0 ZIPC 0.006 0.004 0.001 . 0 0 0 0.1 0 0 0 0 ZIPU 0.004 0.002 0.002 . 0.1 0 0 0 0 0 0 0 ZPHG 0.014 0.007 0.007 . 0 0 0 0.2 0 0 0 0 ZPHP 0.022 0.016 0.006 . 0 0 0 0 0 0 0 0Max Task
Maximum task limit (MXT) and associated information such as current active tasks is reported on ZOSCIX1 with the CICS configuration items.
Report: ZOSCICS1 z/O Veloporate ZMAP 5.1.1 10/24/ Monitor initialized: 10/erial 0E0F78 ---------------------------------------------------------------------- SYSID <--CICS Program-> <----Location---> <----Task Statistics---> APPLID JobName LPARName VMID Total MXT Actv Qued PctM ---- -------- -------- - -------- -------- ----- --- ---- ---- ---- 17:59:00 - 18:00:00 V24A C24ASTND C24ASTND 0 VSIVM5 1197 75 19 0 25.3 V24B C24BSTND C24BSTND 0 VSIVM5 25449 150 10 0 6.7 V61B CICSJA60 CICSJA60 0 VSIVM5 ZVSE61B 364 40 2 0 5.0 CICSJB60 CICSJB60 0 VSIVM5 ZVSE61B 341 50 2 0 4.0It is likely an alert should be set on the percent of max task, so that if max task is approached, there will be an operation alert.
Future topics
Terminal data, errors
DASD Usage
Temporary Storage STID 48
TCPIP Statistics STID 48