Jonathan Patchell
Jonathan Patchell

Reputation: 1

Ignite update query takes significant time after EVT_NODE_SEGMENTED event

In a 4 member cluster, one of the members has the networking interface taken down as part of a network configuration change, for short period of time (5-10s). At this time, all 3 other nodes in the cluster get an EVT_NODE_SEGMENTED event. In response to this event, before shutting down the database, a simple query is performed to update some state in the database. This query takes several minutes (almost 4) and the following is seen in the ignite.log (similar on all 3 nodes with the EVT_NODE_SEGMENTED event):

[2025-02-05T23:18:08,603][WARN ][sys-#65511][diagnostic] First 10 long running transactions [total=1]
[2025-02-05T23:18:08,603][WARN ][sys-#65511][diagnostic] >>> Transaction [startTime=23:14:33,304, curTime=23:18:08,601, systemTime=86604701, userTime=10596, tx=GridNearTxLocal [mappings=IgniteTxMappingsSingleImpl [mapping=GridDistributedTxMapping [entries=LinkedHashSet [IgniteTxEntry [txKey=IgniteTxKey [key=KeyCacheObjectImpl [part=106, val=69f21157-4768-4970-9d7f-fea5938d5c80, hasValBytes=false], cacheId=-2102069554], val=TxEntryValueHolder [val=null, op=TRANSFORM], prevVal=TxEntryValueHolder [val=null, op=NOOP], oldVal=TxEntryValueHolder [val=null, op=NOOP], entryProcessorsCol=LinkedList [T2 {com.thales.lnh.database.io.processor.UpdateNeedToRestartService@5bfb15b3=Object[] [true]}], ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null, filters=CacheEntryPredicate[] [], filtersPassed=false, filtersSet=true, entry=GridDhtDetachedCacheEntry [super=GridDistributedCacheEntry [super=GridCacheMapEntry [key=KeyCacheObjectImpl [part=106, val=69f21157-4768-4970-9d7f-fea5938d5c80, hasValBytes=false], val=null, ver=GridCacheVersion [topVer=0, order=0, nodeOrder=0, dataCenterId=0], hash=543750658, extras=null, flags=0]]], prepared=0, locked=false, nodeId=1c54954f-ced1-456c-a93b-bb588f8263d7, locMapped=false, expiryPlc=null, transferExpiryPlc=false, flags=0, partUpdateCntr=0, serReadVer=null, xidVer=GridCacheVersion [topVer=349780775, order=1738304364281, nodeOrder=1, dataCenterId=0]]], explicitLock=false, queryUpdate=false, dhtVer=null, last=false, nearEntries=0, clientFirst=false, node=1c54954f-ced1-456c-a93b-bb588f8263d7]], nearLocallyMapped=false, colocatedLocallyMapped=false, needCheckBackup=null, hasRemoteLocks=false, trackTimeout=false, systemTime=0, systemStartTime=25099036079917314, prepareStartTime=25099036079917365, prepareTime=0, commitOrRollbackStartTime=0, commitOrRollbackTime=0, lb=null, mvccOp=false, qryId=-1, crdVer=0, thread=OrderedWorkerThread-0, mappings=IgniteTxMappingsSingleImpl [mapping=GridDistributedTxMapping [entries=LinkedHashSet [IgniteTxEntry [txKey=IgniteTxKey [key=KeyCacheObjectImpl [part=106, val=69f21157-4768-4970-9d7f-fea5938d5c80, hasValBytes=false], cacheId=-2102069554], val=TxEntryValueHolder [val=null, op=TRANSFORM], prevVal=TxEntryValueHolder [val=null, op=NOOP], oldVal=TxEntryValueHolder [val=null, op=NOOP], entryProcessorsCol=LinkedList [T2 {com.thales.lnh.database.io.processor.UpdateNeedToRestartService@5bfb15b3=Object[] [true]}], ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null, filters=CacheEntryPredicate[] [], filtersPassed=false, filtersSet=true, entry=GridDhtDetachedCacheEntry [super=GridDistributedCacheEntry [super=GridCacheMapEntry [key=KeyCacheObjectImpl [part=106, val=69f21157-4768-4970-9d7f-fea5938d5c80, hasValBytes=false], val=null, ver=GridCacheVersion [topVer=0, order=0, nodeOrder=0, dataCenterId=0], hash=543750658, extras=null, flags=0]]], prepared=0, locked=false, nodeId=1c54954f-ced1-456c-a93b-bb588f8263d7, locMapped=false, expiryPlc=null, transferExpiryPlc=false, flags=0, partUpdateCntr=0, serReadVer=null, xidVer=GridCacheVersion [topVer=349780775, order=1738304364281, nodeOrder=1, dataCenterId=0]]], explicitLock=false, queryUpdate=false, dhtVer=null, last=false, nearEntries=0, clientFirst=false, node=1c54954f-ced1-456c-a93b-bb588f8263d7]], super=GridDhtTxLocalAdapter [nearOnOriginatingNode=false, span=o.a.i.i.processors.tracing.NoopSpan@6ff02735, nearNodes=KeySetView [], dhtNodes=KeySetView [], explicitLock=false, super=IgniteTxLocalAdapter [completedBase=null, sndTransformedVals=false, depEnabled=false, txState=IgniteTxImplicitSingleStateImpl [init=true, recovery=false, useMvccCaching=false], super=IgniteTxAdapter [xidVer=GridCacheVersion [topVer=349780775, order=1738304364281, nodeOrder=1, dataCenterId=0], writeVer=null, implicit=true, loc=true, threadId=133, startTime=1738710873304, nodeId=b7416ef6-ca24-4efd-9558-c1d3ffa0861c, isolation=READ_COMMITTED, concurrency=OPTIMISTIC, timeout=0, sysInvalidate=false, sys=false, plc=2, commitVer=GridCacheVersion [topVer=349780775, order=1738304364281, nodeOrder=1, dataCenterId=0], finalizing=NONE, invalidParts=null, state=PREPARING, timedOut=false, topVer=AffinityTopologyVersion [topVer=4, minorTopVer=2], mvccSnapshot=null, skipCompletedVers=false, parentTx=null, duration=86615297ms, onePhaseCommit=false], size=1]]]]

The code following query is never reached.

Is there any reason/explanation for why this query is taking so long and the code after the query isn't executed? Are there any insights as to what is happening during this time? Is the EVT_NODE_SEGMENTED event on all other nodes expected for this scenario?

I'm able to replicate the EVT_NODE_SEGMENTED but I'm not able to see the database query take a long time.

Upvotes: 0

Views: 14

Answers (0)

Related Questions