Reputation: 41
My ReplayMerge gets stuck in state ATTEMPT_LIVE_JOIN, then times out due to no progress. It adds the live destination with no issues (I see the corresponding subscription appear in aeron-stat and the onImageAvailable callback is invoked). Eventually it catches up fully but doesn't transition to the next state.
After investigating, I found that the problematic check is in the function shouldStopAndRemoveReplay, where image.activeTransportCount() >= 2 is false because image.activeTransportCount() stays at 1. If it weren't for that check, the ReplayMerge would succeed.
Here are my ReplayMerge parameters:
replayChannel = "aeron:udp"
replayDestination = "aeron:udp?endpoint=localhost:0"
liveDestination = "aeron:udp?endpoint=localhost:0|control=localhost:12345"
I've tried both the Java client and the C++ client. What am I missing?
EDIT: aeron-stat on the client side looks like this:
42: 1 - rcv-local-sockaddr: 41 <some IP address>:54709
43: 452,985,472 - sub-pos: 24 -106708072 3000 aeron:udp?control-mode=manual @0
44: 452,985,472 - rcv-hwm: 28 -106708072 3000 aeron:udp?control-mode=manual
45: 452,985,472 - rcv-pos: 28 -106708072 3000 aeron:udp?control-mode=manual
46: 1 - rcv-local-sockaddr: 41 0.0.0.0:39238
47: 452,971,520 - sub-pos: 24 -106708098 3000 aeron:udp?control-mode=manual @452971520
48: 452,985,472 - rcv-hwm: 89 -106708098 3000 aeron:udp?control-mode=manual
49: 452,971,520 - rcv-pos: 89 -106708098 3000 aeron:udp?control-mode=manual
The first driver subscription is from the replayDestination. All the numbers go up as you would expect, like a normal replay.
The second one is from the added liveDestination. Once created, it doesn't catch up at all, contrary to my initial assessment above. sub-pos and rcv-pos are stuck at the initial position of 452971520, but the rcv-hwm goes up together with the position of the replay subscription. Doesn't this indicate that data is being received but not read on the live destination subscription?
I noticed that the ReplayMerge#image is simply defined as
image = subscription.imageBySessionId((int)replaySessionId);
So I tried instead to poll the Subscription I passed to the ReplayMerge constructor, so that both images would get polled internally. That did not help.
Upvotes: 1
Views: 463
Reputation: 41
I fixed my issue (encountered with this code) by ensuring the replayChannel passed to the ReplayMerge is session ID-specific.
File ReplayMergeTest.java in the aeron codebase does it with
private final String publicationChannel = new ChannelUriStringBuilder()
    // ...
    .tags("1," + PUBLICATION_TAG)
    // ...
    ;
private final String replayChannel = new ChannelUriStringBuilder()
    .media(CommonContext.UDP_MEDIA)
    .isSessionIdTagged(true)
    .sessionId(PUBLICATION_TAG)
    .build();
so that the session ID of the replay channel is set to that of the Publication associated with tag PUBLICATION_TAG. This also works when the publishing media driver and the subscribing media driver are distinct, but you still have to communicate the Publication tag to the subscriber somehow, which might be inconvenient.
So the solution I'll be going for is to take the session ID from the recording descriptor of the recording to be replayed, at the earlier point where I discover recordings with AeronArchive#listRecordingsForUri (or similar).
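As a rough sketch of that last step, the replay channel can be pinned to the recorded session ID once it is known. The class and method names below (ReplayChannels, sessionSpecificReplayChannel) are hypothetical and not part of Aeron. The string is built by hand so the snippet needs no Aeron dependency, on the assumption that it matches what ChannelUriStringBuilder would produce for UDP media plus a session ID. The value -106708098 is used only as an illustration; it is the session ID shown on the live image in the aeron-stat output above.

```java
public final class ReplayChannels {
    private ReplayChannels() {
    }

    /**
     * Builds a UDP replay channel restricted to one session ID.
     * Assumption: recordedSessionId is the sessionId reported in the
     * recording descriptor of the recording that will be replayed.
     */
    public static String sessionSpecificReplayChannel(final int recordedSessionId) {
        // "session-id" is the Aeron channel URI parameter for the session ID.
        return "aeron:udp?session-id=" + recordedSessionId;
    }

    public static void main(final String[] args) {
        // Illustrative value only; session IDs are plain ints and may be negative.
        System.out.println(sessionSpecificReplayChannel(-106708098));
    }
}
```

The resulting string would be passed as the replayChannel argument to ReplayMerge in place of the bare "aeron:udp" from the question.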
This gist shows a working ReplayMerge across two media drivers.
Upvotes: 2