How to resolve this channel issue from WMQ?

wing2ofsky · Jul 23, 2012

Below is the related part from a QMGR log file about a WMQ channel issue:

-------------------------------------------------------------------------------
2012-7-23 10:35:25 - Process(340.1) User(MUSR_MQADMIN) Program(runmqchl.exe)
AMQ9206:  Error sending data to host 86.0.223.5(1602).

EXPLANATION:
An error occurred sending data over TCP/IP to 86.0.223.5(1602). This may be due to
a communications failure.
ACTION:
The return code from the TCP/IP(send) call was 10054 X('2746'). Record these
values and tell your systems administrator.  
----- amqccita.c : 2612 -------------------------------------------------------
2012-7-23 10:35:25 - Process(340.1) User(MUSR_MQADMIN) Program(runmqchl.exe)
AMQ9999: Channel program ended abnormally. 

EXPLANATION: 
Channel program 'CZWJNS.CZWJCZ' ended abnormally. 
ACTION: 
Look at previous error messages for channel program 'CZWJNS.CZWJCZ' in the 
error files to determine the cause of the failure.  
----- amqrccca.c : 834 --------------------------------------------------------
2012-7-23 10:35:35 - Process(3616.1) User(MUSR_MQADMIN) Program(runmqchl.exe)
AMQ9002: Channel 'CZWJNS.CZWJCZ' is starting.

EXPLANATION:
Channel 'CZWJNS.CZWJCZ' is starting.
ACTION:
None.
-------------------------------------------------------------------------------
2012-7-23 10:40:35 - Process(3616.1) User(MUSR_MQADMIN) Program(runmqchl.exe)
AMQ9206:  Error sending data to host 86.0.223.5(1602).

EXPLANATION:
An error occurred sending data over TCP/IP to 86.0.223.5(1602). This may be due to
a communications failure.
ACTION:
The return code from the TCP/IP(send) call was 10054 X('2746'). Record these
values and tell your systems administrator.   
----- amqccita.c : 2612 -------------------------------------------------------
2012-7-23 10:40:35 - Process(3616.1) User(MUSR_MQADMIN) Program(runmqchl.exe)
AMQ9999: Channel program ended abnormally. 

EXPLANATION: 
Channel program 'CZWJNS.CZWJCZ' ended abnormally. 
ACTION: 
Look at previous error messages for channel program 'CZWJNS.CZWJCZ' in the 
error files to determine the cause of the failure.  
----- amqrccca.c : 834 --------------------------------------------------------
2012-7-23 10:40:45 - Process(4848.1) User(MUSR_MQADMIN) Program(runmqchl.exe)
AMQ9002: Channel 'CZWJNS.CZWJCZ' is starting.

EXPLANATION:
Channel 'CZWJNS.CZWJCZ' is starting.
ACTION:
None.
-------------------------------------------------------------------------------

Right now, the channel (CZWJNS.CZWJCZ) does eventually run, but only after several retry attempts, and this happens frequently. All the messages are delivered successfully to the target queue on the remote QMGR host, but they are always delayed by the retries.

I searched the internet for return code 10054, and it means the connection was reset by the peer.

My WMQ version is 6.0.10 on Windows 2003.

Answer

T.Rob · Jul 23, 2012

A "Connection reset by peer" error means that something between this node and the other node closed the connection. The cause can be anything from a dodgy or noisy network, to a firewall timing out the connection, to a channel exit that refuses the connection, among many other possibilities.

The key to diagnosis in these cases is to narrow down the cause. This requires looking at the error logs on both QMgrs (or on the client and the QMgr) for the same event. In the case of a channel exit, the channel definitions on both sides (the SCYEXIT, MSGEXIT, SENDEXIT, and RCVEXIT attributes) show whether such an exit is in place. If one is, you need to look at the exit's configuration and logs as well.
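As one way to start narrowing things down on the sending side, the timing of the failures can be pulled out of the error log. The following is a minimal Python sketch, not an IBM tool: the function names `parse_entries` and `seconds_until_send_error` are made up for illustration, the regular expressions assume the entry layout seen in the log excerpt in the question, and `SAMPLE` is a shortened copy of that excerpt. It measures how long the channel stayed up between each AMQ9002 (starting) and the next AMQ9206 (send error).

```python
import re
from datetime import datetime

# Header line of an error-log entry, e.g.
# "2012-7-23 10:35:25 - Process(340.1) User(MUSR_MQADMIN) Program(runmqchl.exe)"
# (layout assumed from the excerpt in the question)
HEADER = re.compile(
    r"^(\d{4})-(\d{1,2})-(\d{1,2}) (\d{1,2}):(\d{2}):(\d{2}) - Process\("
)
# Message line, e.g. "AMQ9206:  Error sending data to host ..."
MESSAGE = re.compile(r"^(AMQ\d{4}):")


def parse_entries(log_text):
    """Return a list of (timestamp, message_id) pairs in log order."""
    entries = []
    pending = None  # timestamp of a header still waiting for its AMQ line
    for line in log_text.splitlines():
        header = HEADER.match(line)
        if header:
            pending = datetime(*(int(part) for part in header.groups()))
            continue
        message = MESSAGE.match(line)
        if message and pending is not None:
            entries.append((pending, message.group(1)))
            pending = None
    return entries


def seconds_until_send_error(entries):
    """For each AMQ9002 (channel starting), seconds until the next AMQ9206."""
    gaps = []
    started = None
    for timestamp, message_id in entries:
        if message_id == "AMQ9002":
            started = timestamp
        elif message_id == "AMQ9206" and started is not None:
            gaps.append((timestamp - started).total_seconds())
            started = None
    return gaps


# Shortened excerpt of the log posted in the question.
SAMPLE = """\
2012-7-23 10:35:35 - Process(3616.1) User(MUSR_MQADMIN) Program(runmqchl.exe)
AMQ9002: Channel 'CZWJNS.CZWJCZ' is starting.
-------------------------------------------------------------------------------
2012-7-23 10:40:35 - Process(3616.1) User(MUSR_MQADMIN) Program(runmqchl.exe)
AMQ9206:  Error sending data to host 86.0.223.5(1602).
----- amqccita.c : 2612 -------------------------------------------------------
2012-7-23 10:40:35 - Process(3616.1) User(MUSR_MQADMIN) Program(runmqchl.exe)
AMQ9999: Channel program ended abnormally.
"""

if __name__ == "__main__":
    print(seconds_until_send_error(parse_entries(SAMPLE)))  # [300.0]
```

If the gap turns out to be the same every time, that may point toward some kind of timeout rather than random network noise, but that is only a hint and still needs to be confirmed against the other side's logs.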

If the problem is in the network, the error logs from both QMgrs will show similar errors. However, if one QMgr closed the connection intentionally, then you will see that in its log files.
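Comparing the two logs for the same event can be sketched roughly as below. This is an illustration under assumptions: `correlate` is a made-up helper, the clocks on both hosts are assumed to be roughly in sync (hence the tolerance), and the remote event is invented purely to show the shape of the output, since the question includes no remote log.

```python
from datetime import datetime, timedelta


def correlate(local_events, remote_events, tolerance=timedelta(seconds=5)):
    """Pair each local event with the remote events logged within `tolerance`."""
    pairs = []
    for local_time, local_id in local_events:
        nearby = [
            remote_id
            for remote_time, remote_id in remote_events
            if abs(remote_time - local_time) <= tolerance
        ]
        pairs.append((local_time, local_id, nearby))
    return pairs


# Local events: the two send errors from the log in the question.
local = [
    (datetime(2012, 7, 23, 10, 35, 25), "AMQ9206"),
    (datetime(2012, 7, 23, 10, 40, 35), "AMQ9206"),
]
# HYPOTHETICAL remote events, invented for illustration only.
remote = [
    (datetime(2012, 7, 23, 10, 35, 26), "AMQ9208"),
]

for local_time, local_id, nearby in correlate(local, remote):
    status = ", ".join(nearby) if nearby else "nothing logged remotely"
    print(local_time, local_id, "->", status)
```

A local error with a matching error on the remote side fits the "both ends saw the connection drop" pattern, while a local error with nothing comparable on the remote side is a reason to look more closely at what the remote end, or something in between, did at that moment.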