zkarj

Reputation: 689

When should I reset my MQ channels?

I've been studying the vagaries of channel statuses, how they get to those states and what to do to get them stopped or started. I've got a pretty solid understanding now, but a colleague brought up the topic of channel resets.

I've done them occasionally when I couldn't explain what was going on, but now that I understand things a bit better, I'm not sure his advice to "always reset" when stopping troublesome channels is right.

Searching online, it's clear that a reset is needed when channels are recreated. But when stuff just breaks, whether a queue manager drops unexpectedly, the network fails, or something like that, is a reset a good idea in general? Or should I only bother if I see sequence errors or the channel otherwise refuses to start when I know it should?

Upvotes: 2

Views: 8608

Answers (2)

glennb

Reputation: 96

FYI, if you are resetting from the sending side of the channel, it's OK to set the sequence number to 1. The receiving side will then also go back to 1. QED :-)

If you are resetting from the receiving side of the channel, you must use the sequence number that the sender was expecting.

These numbers are in the queue manager error logs on both sides.

If the channel is in RETRY state, it will try to use the new sequence numbers when it does the next retry. This could be up to 20 minutes away if you are using the default retry attributes on the sender channel. A simple way to bump this is to STOP the channel and then START it again straight away.
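As a rough illustration of the reset-then-bounce sequence described above, here is a minimal Python sketch that only builds the MQSC command text; it does not talk to a queue manager. The function name `reset_commands` and the channel name `QM1.TO.QM2` are made up for the example. The MQSC verbs themselves (`RESET CHANNEL ... SEQNUM`, `STOP CHANNEL`, `START CHANNEL`) are the standard ones.

```python
def reset_commands(channel, side, seqnum=None):
    """Build MQSC text for resetting a channel's sequence number.

    channel is a hypothetical channel name; side is "sender" or "receiver".
    """
    if side == "sender":
        # From the sending side, going back to 1 is fine: the receiving
        # side follows the sender the next time the channel starts.
        number = 1 if seqnum is None else seqnum
        return [
            f"RESET CHANNEL({channel}) SEQNUM({number})",
            # Stop and start so a channel in RETRY does not wait for
            # its next retry interval before using the new number.
            f"STOP CHANNEL({channel})",
            f"START CHANNEL({channel})",
        ]
    if side == "receiver":
        # From the receiving side there is no safe default: the number
        # must be the one the sender expects (see the error logs).
        if seqnum is None:
            raise ValueError("receiver-side reset needs the sender's sequence number")
        return [f"RESET CHANNEL({channel}) SEQNUM({seqnum})"]
    raise ValueError(f"unknown side: {side!r}")


if __name__ == "__main__":
    print("\n".join(reset_commands("QM1.TO.QM2", "sender")))
```

The printed lines could then be pasted into a `runmqsc` session on the relevant queue manager. Depending on how quickly the channel actually stops, the START may need to be reissued.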

HTH, G.

Upvotes: 4

T.Rob

Reputation: 31852

Channels get sequence errors for a few reasons:

  1. The local and remote MCAs got out of sync on a batch. Usually the remote MCA committed the batch but the local one did not. If you know the remote side delivered the batch, issue a RESOLVE ACTION(COMMIT) on the channel, otherwise issue RESOLVE ACTION(BACKOUT). After resolving, issue RESET.
  2. The channel points to a new QMgr. Perhaps after failover at the DNS, circuit or firewall NAT, a different QMgr of the same name is now attached to the channel. These should be well known because the failover (hopefully) doesn't happen without some alerts going off.
  3. The contents of the channel sync queue are in error. Sometimes the QMgr can cause this but those issues are resolved (so far as I know) in recent versions. Sometimes people accidentally mess up the sync queue, usually by browsing it with a lock while the channels are trying to use it. This is a little harder to resolve and may require clearing the sync queue but check with IBM Support first.
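For the first case in the list, the resolve-then-reset step can be sketched as a small Python helper that just emits the MQSC text. It assumes the usual pairing: COMMIT when the remote side is known to have delivered the batch, BACKOUT when it did not. The helper name `resolve_commands`, the channel name, and the choice of `SEQNUM(1)` (a sender-side reset) are assumptions for the example, not something the thread prescribes.

```python
def resolve_commands(channel, remote_committed):
    """Build MQSC text to resolve an in-doubt batch and then reset.

    channel is a hypothetical sender channel name; remote_committed is
    True only if you know the remote side delivered the batch.
    """
    # COMMIT discards the local copy of the batch because the remote end
    # already has it; BACKOUT keeps the messages so they are sent again.
    action = "COMMIT" if remote_committed else "BACKOUT"
    return [
        f"RESOLVE CHANNEL({channel}) ACTION({action})",
        # Assumption: this runs on the sending side, so 1 is a safe value.
        f"RESET CHANNEL({channel}) SEQNUM(1)",
    ]


if __name__ == "__main__":
    for line in resolve_commands("QM1.TO.QM2", remote_committed=True):
        print(line)
```

Picking the wrong action is what loses or duplicates messages, so the boolean is deliberately a required argument with no default.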

When the channel is out of sync because of a known exception like failover, go ahead and reset it. Otherwise, you'd be well advised to find out why it's out of sync. You might reset it just to get it up and running, but hopefully not until you've saved off the <QMGR>/errors/AMQERR*.LOG files and any FDCs so you can diagnose the cause.
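Saving the evidence before a reset can be as simple as copying the files somewhere safe. The following is a minimal sketch in Python. The function name `save_diagnostics` and all directory arguments are hypothetical; where your installation keeps the queue manager's errors directory and its FDC files is something you need to supply yourself.

```python
import shutil
from pathlib import Path


def save_diagnostics(errors_dir, dest_dir, fdc_dir=None):
    """Copy AMQERR*.LOG files (and optionally *.FDC files) into dest_dir.

    All three paths are supplied by the caller; nothing here assumes a
    particular MQ installation layout. Returns the copied file names.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    sources = list(Path(errors_dir).glob("AMQERR*.LOG"))
    if fdc_dir is not None:
        sources += list(Path(fdc_dir).glob("*.FDC"))

    copied = []
    for src in sources:
        # copy2 keeps timestamps, which helps when lining up log entries
        # with the time the channel went out of sync.
        shutil.copy2(src, dest / src.name)
        copied.append(src.name)
    return sorted(copied)
```

Called as, say, `save_diagnostics(errors_dir, "/tmp/chl-incident")` with `errors_dir` pointing at the queue manager's errors directory, it leaves a snapshot you can still examine after the channel is back up.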

Upvotes: 2
