I have a production transaction issue and need suggestions for a fix:
We have transaction QQQQ, which is triggered by MQ as soon as a new message arrives on the queue.
The transaction QQQQ reads the message using the options below:
    COMPUTE MQGMO-OPTIONS = MQGMO-ACCEPT-TRUNCATED-MSG
                          + MQGMO-SYNCPOINT
                          + MQGMO-WAIT
    CALL MQGET USING MQ-CONNECTION-HANDLE,
                     MQ-OBJECT-HANDLE,
                     MQMD,
                     MQGMO,
                     MQ-MSG-LENGTH,
                     WS--PACKETS,
                     MQ-DATALEN,
                     MQ-COMPLETION-CODE,
                     MQ-REASON-CODE.
So the options clearly state that the message will be deleted from the queue only when transaction QQQQ executes a SYNCPOINT.
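Because the MQGET is done under syncpoint, a unit of work that is backed out makes the same message available to the next MQGET, and the queue manager increments the backout count in the message descriptor each time. A minimal sketch of a check that could follow the MQGET is shown here. It is only a fragment and has not been tested in our region: WS-BACKOUT-LIMIT and the two X-paragraphs are hypothetical names, and MQMD-BACKOUTCOUNT assumes the standard MQMD copybook is included under the 01 level MQMD used in the call above.

```cobol
      * Fragment only: meant to sit right after the MQGET call.
      * WS-BACKOUT-LIMIT, X100-LOG-BACKOUT and
      * X200-MOVE-TO-ERROR-QUEUE are hypothetical names.
      * MQMD-BACKOUTCOUNT is assumed to be the field generated by
      * the standard MQMD copybook under the 01 level MQMD.
           IF MQ-COMPLETION-CODE NOT = MQCC-FAILED
      *        A message was returned (possibly truncated).
               IF MQMD-BACKOUTCOUNT > WS-BACKOUT-LIMIT
      *            The same message has been backed out and
      *            redelivered more often than we allow.
                   PERFORM X100-LOG-BACKOUT
                   PERFORM X200-MOVE-TO-ERROR-QUEUE
               END-IF
           END-IF
```

If the count keeps climbing for the same message, that would suggest the unit of work is being backed out rather than committed.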
After it reads the message, it does some processing on the data and updates the production VSAM files.
Now here comes the issue:
The transaction has been looping quite often lately, consuming high CPU. When we analysed it, this is what we found:
The transaction calls a program 'AAAAA', which updates the VSAM files. QQQQ passes part of the data by placing it in the Transient Storage queue and passes the rest by reference to the called program 'AAAAA'.
At times the transaction goes into an infinite loop because it was not able to update the VSAM files through AAAAA.
Any update the transaction makes has to be applied to all the files or to none. So if a concurrent update happens, a SYNCROLLBACK is issued to undo the changes to the files.
But at times the SYNCROLLBACK does not work properly: in one of the files the record update is not undone and gets committed. Because a rollback was issued, the transaction processes the message again, but it cannot update the files, since one of them already holds the record written in the previous try, which results in a duplicate record on insertion.
So the AAAAA program returns control to transaction QQQQ. The code in QQQQ is written so that it tries ten times to update the files when a concurrent update or a duplicate record is found:
    PERFORM C150-UPDATE-REC.
* This paragraph calls AAAAA, which updates the production VSAM files with the MQ message details.
* The concurrent update flag is set when AAAAA returns either of these file statuses:
*     DB-ACCESS-RETURN-STATUS = CONCURRENT-UPDATE
*     DB-ACCESS-RETURN-STATUS = DUPLICATE-RECORD-FOUND
    IF WS-CONCURRENT-UPDATE-FLAG = YES
*** On a concurrent update, try up to ten times to write the data to the files.
        PERFORM D115-UPDATE-FILES
            VARYING WS-TEMP
            FROM +1 BY +1
            UNTIL WS-TEMP > WS-NUM-RETRIES
               OR WS-CONCURRENT-UPDATE-FLAG = NOPE
    ELSE
        NEXT SENTENCE.
*** After ten tries, control reaches the IF statement below.
    IF WS-CONCURRENT-UPDATE-FLAG = YES
*** If the flag is still set, log the error message in another file and write the message to the error queue.
        MOVE DB-ACCESS-RETURN-STATUS
          TO TRM-RETURN-STATUS
        PERFORM E900-HANDLE-ERROR
        ........
        ..........
        PERFORM LOG-ERR-MESSAGE
    ELSE
        NEXT SENTENCE.
*** After that, the transaction executes the SYNCPOINT command. This should mean that after ten tries the message is removed from the message queue, but in our case it is not. The transaction is designed to run until it has processed every message in the queue. Because it keeps reading the same message, it loops indefinitely, which causes the high CPU utilization.
* ONE REPLY IS ONE UNIT OF WORK, SO COMMIT UPDATES
    EXEC CICS SYNCPOINT
    END-EXEC.
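To double-check my reading of the retry loop above, here is a minimal stand-alone sketch that can be compiled outside CICS (for example with GnuCOBOL). It is not the production code: the D115-UPDATE-FILES paragraph is a stub that always reports a concurrent update, WS-CALL-COUNT is a hypothetical counter added for illustration, and the YES/NOPE comparisons are replaced by 88-level condition names.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RETRYSIM.
      * Stand-alone simulation of the retry loop in QQQQ.
      * D115-UPDATE-FILES is a stub here: it always reports a
      * concurrent update, so only the counter can end the loop.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-TEMP                     PIC S9(4) COMP VALUE +0.
       01  WS-NUM-RETRIES              PIC S9(4) COMP VALUE +10.
      * Hypothetical counter, not in the original program.
       01  WS-CALL-COUNT               PIC S9(4) COMP VALUE +0.
       01  WS-CONCURRENT-UPDATE-FLAG   PIC X(1)  VALUE 'Y'.
           88  CONCURRENT-UPDATE-HIT   VALUE 'Y'.
           88  CONCURRENT-UPDATE-CLEAR VALUE 'N'.
       PROCEDURE DIVISION.
       A000-MAIN.
      * The UNTIL condition is tested before each iteration.
           PERFORM D115-UPDATE-FILES
               VARYING WS-TEMP FROM +1 BY +1
               UNTIL WS-TEMP > WS-NUM-RETRIES
                  OR CONCURRENT-UPDATE-CLEAR
           DISPLAY 'CALLS MADE: ' WS-CALL-COUNT
           IF CONCURRENT-UPDATE-HIT
               DISPLAY 'FLAG STILL SET, ERROR PATH WOULD RUN'
           END-IF
           STOP RUN.
       D115-UPDATE-FILES.
      * Stub for the real paragraph that calls AAAAA.
           ADD 1 TO WS-CALL-COUNT
           SET CONCURRENT-UPDATE-HIT TO TRUE.
```

With the stub always failing, the counter limits the loop to WS-NUM-RETRIES calls and the flag is still set afterwards, so the retry loop by itself should not spin forever. That suggests the endless looping comes from the same message being read again rather than from this PERFORM.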
Here I have a few questions:
1. How can a record written to the file not be undone when the ROLLBACK command is issued?
2. Is there a chance of writing the same record twice, given that the problematic file gets its data from the Transient Storage queue?
3. Why is the SYNCPOINT not able to remove the message from MQ? Is that possible?
I did come up with a plan to fix this, but I feel it is not an optimal solution, so can you let me know what I can do to fix the issue? Please let me know if you need any other details.
Thanks for your time !!
Nara