RSSAH
RSSAH

Reputation: 153

Apache Ignite Persistence Issues/Warnings

We are running a write-intensive load test on Apache Ignite. We are also doing reads. We are getting the following warning messages from sys-stripe threads quite often after a few hours of testing.

DB Config -> RAM - 8GB, CPU cores - 64, Persistence - ON, Heap - 2GB, Durable memory Off-heap - 2GB, XX:MaxDirectMemorySize - 1GB, WAL Archiving - Off, checkpointBufferSize - 1GB, walSegmentSize - 256mb

We initialized 70 threads(which query) in App Server (Client) and the client has 64 cores CPU with 2GB maxheap configuration.

[WARNING][grid-timeout-worker-#135][G] >>> Possible starvation in striped pool.
    Thread name: sys-stripe-1-#2
    Queue: [Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridNearGetRequest [futId=49990d3e171-cd1ebc9c-a326-4009-851b-1a3f2a703edd, miniId=59990d3e171-cd1ebc9c-a326-4009-851b-1a3f2a703edd, ver=null, keyMap=null, flags=1, topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], subjId=2ab5710f-6568-4940-b3cc-ce756a634f4e, taskNameHash=0, createTtl=-1, accessTtl=-1, txLbl=null, mvccSnapshot=null]]], Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridNearAtomicSingleUpdateFilterRequest [filter=[o.a.i.i.processors.cache.CacheEntrySerializablePredicate@653d7f85], parent=GridNearAtomicSingleUpdateRequest [key=KeyCacheObjectImpl [part=129, val=null, hasValBytes=true], parent=GridNearAtomicAbstractSingleUpdateRequest [nodeId=null, futId=29605669, topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], parent=GridNearAtomicAbstractUpdateRequest [res=null, flags=needRes|keepBinary]]]]]], Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridNearTxFinishRequest [miniId=1, mvccSnapshot=null, super=GridDistributedTxFinishRequest [topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], futId=97a90d3e171-cd1ebc9c-a326-4009-851b-1a3f2a703edd, threadId=1251, commitVer=null, invalidate=false, commit=false, baseVer=null, txSize=0, sys=true, plc=2, subjId=2ab5710f-6568-4940-b3cc-ce756a634f4e, taskNameHash=0, flags=32, syncMode=FULL_SYNC, txState=null, super=GridDistributedBaseMessage [ver=GridCacheVersion [topVer=200084669, order=1588652521556, nodeOrder=3], committedVers=null, rolledbackVers=null, cnt=0, super=GridCacheIdMessage [cacheId=0, super=GridCacheMessage [msgId=59316508, depInfo=null, lastAffChangedTopVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], err=null, skipPrepare=false]]]]]]], Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridNearTxPrepareRequest [futId=cca90d3e171-cd1ebc9c-a326-4009-851b-1a3f2a703edd, miniId=1, topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], subjId=2ab5710f-6568-4940-b3cc-ce756a634f4e, taskNameHash=0, txLbl=null, flags=, super=GridDistributedTxPrepareRequest [threadId=2528, concurrency=PESSIMISTIC, isolation=REPEATABLE_READ, writeVer=GridCacheVersion [topVer=200084669, order=1588652521620, nodeOrder=3], timeout=0, reads=ArrayList [], writes=ArrayList [IgniteTxEntry [txKey=null, val=TxEntryValueHolder [val=CacheObjectImpl [val=null, hasValBytes=true], op=UPDATE], prevVal=TxEntryValueHolder [val=null, op=NOOP], oldVal=TxEntryValueHolder [val=null, op=NOOP], entryProcessorsCol=null, ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null, filters=null, filtersPassed=false, filtersSet=false, entry=null, prepared=0, locked=false, nodeId=null, locMapped=false, expiryPlc=null, transferExpiryPlc=false, flags=0, partUpdateCntr=0, serReadVer=null, xidVer=null]], dhtVers=null, txSize=0, plc=2, txState=null, flags=onePhase|last|sys, super=GridDistributedBaseMessage [ver=GridCacheVersion [topVer=200084669, order=1588652521620, nodeOrder=3], committedVers=null, rolledbackVers=null, cnt=0, super=GridCacheIdMessage [cacheId=0, super=GridCacheMessage [msgId=59316562, depInfo=null, lastAffChangedTopVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], err=null, skipPrepare=false]]]]]]], Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridNearGetRequest [futId=36c90d3e171-cd1ebc9c-a326-4009-851b-1a3f2a703edd, miniId=46c90d3e171-cd1ebc9c-a326-4009-851b-1a3f2a703edd, ver=null, keyMap=null, flags=1, topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], subjId=2ab5710f-6568-4940-b3cc-ce756a634f4e, taskNameHash=0, createTtl=-1, accessTtl=-1, txLbl=null, mvccSnapshot=null]]], Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridNearTxPrepareRequest [futId=39c90d3e171-cd1ebc9c-a326-4009-851b-1a3f2a703edd, miniId=1, topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], subjId=2ab5710f-6568-4940-b3cc-ce756a634f4e, taskNameHash=0, txLbl=null, flags=, super=GridDistributedTxPrepareRequest [threadId=2141, concurrency=PESSIMISTIC, isolation=REPEATABLE_READ, writeVer=GridCacheVersion [topVer=200084669, order=1588652522004, nodeOrder=3], timeout=0, reads=ArrayList [], writes=ArrayList [IgniteTxEntry [txKey=null, val=TxEntryValueHolder [val=CacheObjectImpl [val=null, hasValBytes=true], op=UPDATE], prevVal=TxEntryValueHolder [val=null, op=NOOP], oldVal=TxEntryValueHolder [val=null, op=NOOP], entryProcessorsCol=null, ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null, filters=null, filtersPassed=false, filtersSet=false, entry=null, prepared=0, locked=false, nodeId=null, locMapped=false, expiryPlc=null, transferExpiryPlc=false, flags=0, partUpdateCntr=0, serReadVer=null, xidVer=null]], dhtVers=null, txSize=0, plc=2, txState=null, flags=onePhase|last|sys, super=GridDistributedBaseMessage [ver=GridCacheVersion [topVer=200084669, order=1588652522004, nodeOrder=3], committedVers=null, rolledbackVers=null, cnt=0, super=GridCacheIdMessage [cacheId=0, super=GridCacheMessage [msgId=59316868, depInfo=null, lastAffChangedTopVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], err=null, skipPrepare=false]]]]]]], Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridNearGetRequest [futId=8cc90d3e171-cd1ebc9c-a326-4009-851b-1a3f2a703edd, miniId=9cc90d3e171-cd1ebc9c-a326-4009-851b-1a3f2a703edd, ver=null, keyMap=null, flags=1, topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], subjId=2ab5710f-6568-4940-b3cc-ce756a634f4e, taskNameHash=0, createTtl=-1, accessTtl=-1, txLbl=null, mvccSnapshot=null]]], Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridNearGetRequest [futId=4cd90d3e171-cd1ebc9c-a326-4009-851b-1a3f2a703edd, miniId=5cd90d3e171-cd1ebc9c-a326-4009-851b-1a3f2a703edd, ver=null, keyMap=null, flags=1, topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], subjId=2ab5710f-6568-4940-b3cc-ce756a634f4e, taskNameHash=0, createTtl=-1, accessTtl=-1, txLbl=null, mvccSnapshot=null]]], Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridNearGetRequest [futId=28e90d3e171-cd1ebc9c-a326-4009-851b-1a3f2a703edd, miniId=38e90d3e171-cd1ebc9c-a326-4009-851b-1a3f2a703edd, ver=null, keyMap=null, flags=1, topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], subjId=2ab5710f-6568-4940-b3cc-ce756a634f4e, taskNameHash=0, createTtl=-1, accessTtl=-1, txLbl=null, mvccSnapshot=null]]], Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridNearTxPrepareRequest [futId=d9e90d3e171-cd1ebc9c-a326-4009-851b-1a3f2a703edd, miniId=1, topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], subjId=2ab5710f-6568-4940-b3cc-ce756a634f4e, taskNameHash=0, txLbl=null, flags=, super=GridDistributedTxPrepareRequest [threadId=3430, concurrency=PESSIMISTIC, isolation=REPEATABLE_READ, writeVer=GridCacheVersion [topVer=200084669, order=1588652522452, nodeOrder=3], timeout=0, reads=ArrayList [], writes=ArrayList [IgniteTxEntry [txKey=null, val=TxEntryValueHolder [val=CacheObjectImpl [val=null, hasValBytes=true], op=UPDATE], prevVal=TxEntryValueHolder [val=null, op=NOOP], oldVal=TxEntryValueHolder [val=null, op=NOOP], entryProcessorsCol=null, ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null, filters=null, filtersPassed=false, filtersSet=false, entry=null, prepared=0, locked=false, nodeId=null, locMapped=false, expiryPlc=null, transferExpiryPlc=false, flags=0, partUpdateCntr=0, serReadVer=null, xidVer=null]], dhtVers=null, txSize=0, plc=2, txState=null, flags=onePhase|last|sys, super=GridDistributedBaseMessage [ver=GridCacheVersion [topVer=200084669, order=1588652522452, nodeOrder=3], committedVers=null, rolledbackVers=null, cnt=0, super=GridCacheIdMessage [cacheId=0, super=GridCacheMessage [msgId=59317224, depInfo=null, lastAffChangedTopVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], err=null, skipPrepare=false]]]]]]], Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridNearSingleGetRequest [futId=1588665032453, key=KeyCacheObjectImpl [part=385, val=null, hasValBytes=true], flags=1, topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], subjId=2ab5710f-6568-4940-b3cc-ce756a634f4e, taskNameHash=0, createTtl=-1, accessTtl=-1, txLbl=null, mvccSnapshot=null]]]]
    Deadlock: false
    Completed: 1094133
Thread [name="sys-stripe-1-#2", id=29, state=RUNNABLE, blockCnt=32, waitCnt=1280827]
        at o.a.i.i.processors.cache.persistence.wal.serializer.RecordDataV1Serializer.dataSize(RecordDataV1Serializer.java:2083)
        at o.a.i.i.processors.cache.persistence.wal.serializer.RecordDataV1Serializer.plainSize(RecordDataV1Serializer.java:386)
        at o.a.i.i.processors.cache.persistence.wal.serializer.RecordDataV2Serializer.plainSize(RecordDataV2Serializer.java:101)
        at o.a.i.i.processors.cache.persistence.wal.serializer.RecordDataV1Serializer.size(RecordDataV1Serializer.java:181)
        at o.a.i.i.processors.cache.persistence.wal.serializer.RecordV2Serializer$2.sizeWithHeaders(RecordV2Serializer.java:96)
        at o.a.i.i.processors.cache.persistence.wal.serializer.RecordV2Serializer.size(RecordV2Serializer.java:226)
        at o.a.i.i.processors.cache.persistence.wal.FileWriteAheadLogManager.log(FileWriteAheadLogManager.java:837)
        at o.a.i.i.processors.cache.persistence.wal.FileWriteAheadLogManager.log(FileWriteAheadLogManager.java:796)
        at o.a.i.i.processors.cache.GridCacheMapEntry.logUpdate(GridCacheMapEntry.java:4307)
        at o.a.i.i.processors.cache.GridCacheMapEntry$AtomicCacheUpdateClosure.remove(GridCacheMapEntry.java:6505)
        at o.a.i.i.processors.cache.GridCacheMapEntry$AtomicCacheUpdateClosure.call(GridCacheMapEntry.java:6177)
        at o.a.i.i.processors.cache.GridCacheMapEntry$AtomicCacheUpdateClosure.call(GridCacheMapEntry.java:5863)
        at o.a.i.i.processors.cache.persistence.tree.BPlusTree$Invoke.invokeClosure(BPlusTree.java:3820)
        at o.a.i.i.processors.cache.persistence.tree.BPlusTree$Invoke.access$5700(BPlusTree.java:3714)
        at o.a.i.i.processors.cache.persistence.tree.BPlusTree.invokeDown(BPlusTree.java:1969)
        at o.a.i.i.processors.cache.persistence.tree.BPlusTree.invokeDown(BPlusTree.java:1940)
        at o.a.i.i.processors.cache.persistence.tree.BPlusTree.invoke(BPlusTree.java:1847)
        at o.a.i.i.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke0(IgniteCacheOffheapManagerImpl.java:1654)
        at o.a.i.i.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke(IgniteCacheOffheapManagerImpl.java:1637)
        at o.a.i.i.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.invoke(GridCacheOffheapManager.java:2436)
        at o.a.i.i.processors.cache.IgniteCacheOffheapManagerImpl.invoke(IgniteCacheOffheapManagerImpl.java:433)
        at o.a.i.i.processors.cache.GridCacheMapEntry.innerUpdate(GridCacheMapEntry.java:2309)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateSingle(GridDhtAtomicCache.java:2576)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update(GridDhtAtomicCache.java:2036)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1854)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1668)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:3241)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$400(GridDhtAtomicCache.java:139)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:273)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:268)
        at o.a.i.i.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1142)
        at o.a.i.i.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:591)
        at o.a.i.i.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:392)
        at o.a.i.i.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:318)
        at o.a.i.i.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:109)
        at o.a.i.i.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:308)
        at o.a.i.i.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1635)
        at o.a.i.i.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1255)
        at o.a.i.i.managers.communication.GridIoManager.access$4300(GridIoManager.java:144)
        at o.a.i.i.managers.communication.GridIoManager$8.execute(GridIoManager.java:1144)
        at o.a.i.i.managers.communication.TraceRunnable.run(TraceRunnable.java:50)
        at o.a.i.i.util.StripedExecutor$Stripe.body(StripedExecutor.java:559)
        at o.a.i.i.util.worker.GridWorker.run(GridWorker.java:119)
        at java.lang.Thread.run(Thread.java:748)

Another type of warning

[WARNING][grid-timeout-worker-#135][G] >>> Possible starvation in striped pool.
    Thread name: sys-stripe-0-#1
    Queue: [Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridNearGetRequest [futId=a1f90d3e171-cd1ebc9c-a326-4009-851b-1a3f2a703edd, miniId=b1f90d3e171-cd1ebc9c-a326-4009-851b-1a3f2a703edd, ver=null, keyMap=null, flags=1, topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], subjId=2ab5710f-6568-4940-b3cc-ce756a634f4e, taskNameHash=0, createTtl=-1, accessTtl=-1, txLbl=null, mvccSnapshot=null]]]]
    Deadlock: false
    Completed: 1396153
Thread [name="sys-stripe-0-#1", id=28, state=TIMED_WAITING, blockCnt=61, waitCnt=1874266]
    Lock [object=java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@1b08480, ownerName=null, ownerId=-1]
        at sun.misc.Unsafe.park(Native Method)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
        at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.tryLock(ReentrantReadWriteLock.java:871)
        at o.a.i.i.processors.cache.persistence.GridCacheDatabaseSharedManager.checkpointReadLock(GridCacheDatabaseSharedManager.java:1638)
        at o.a.i.i.processors.cache.distributed.dht.GridDhtTxPrepareFuture.onEntriesLocked(GridDhtTxPrepareFuture.java:368)
        at o.a.i.i.processors.cache.distributed.dht.GridDhtTxPrepareFuture.prepare0(GridDhtTxPrepareFuture.java:1304)
        at o.a.i.i.processors.cache.distributed.dht.GridDhtTxPrepareFuture.mapIfLocked(GridDhtTxPrepareFuture.java:709)
        at o.a.i.i.processors.cache.distributed.dht.GridDhtTxPrepareFuture.prepare(GridDhtTxPrepareFuture.java:1102)
        at o.a.i.i.processors.cache.distributed.dht.GridDhtTxLocal.prepareAsync(GridDhtTxLocal.java:410)
        at o.a.i.i.processors.cache.transactions.IgniteTxHandler.prepareNearTx(IgniteTxHandler.java:576)
        at o.a.i.i.processors.cache.transactions.IgniteTxHandler.prepareNearTx(IgniteTxHandler.java:373)
        at o.a.i.i.processors.cache.transactions.IgniteTxHandler.processNearTxPrepareRequest0(IgniteTxHandler.java:182)
        at o.a.i.i.processors.cache.transactions.IgniteTxHandler.processNearTxPrepareRequest(IgniteTxHandler.java:160)
        at o.a.i.i.processors.cache.transactions.IgniteTxHandler.access$000(IgniteTxHandler.java:122)
        at o.a.i.i.processors.cache.transactions.IgniteTxHandler$1.apply(IgniteTxHandler.java:204)
        at o.a.i.i.processors.cache.transactions.IgniteTxHandler$1.apply(IgniteTxHandler.java:202)
        at o.a.i.i.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1142)
        at o.a.i.i.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:591)
        at o.a.i.i.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:392)
        at o.a.i.i.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:318)
        at o.a.i.i.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:109)
        at o.a.i.i.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:308)
        at o.a.i.i.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1635)
        at o.a.i.i.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1255)
        at o.a.i.i.managers.communication.GridIoManager.access$4300(GridIoManager.java:144)
        at o.a.i.i.managers.communication.GridIoManager$8.execute(GridIoManager.java:1144)
        at o.a.i.i.managers.communication.TraceRunnable.run(TraceRunnable.java:50)
        at o.a.i.i.util.StripedExecutor$Stripe.body(StripedExecutor.java:559)
        at o.a.i.i.util.worker.GridWorker.run(GridWorker.java:119)
        at java.lang.Thread.run(Thread.java:748)

Pool metric

Striped thread pool [active=7, idle=57, qSize=9]

I am assuming because of the above warnings we are getting the below delay on SQL queries.

[WARNING][long-qry-#170][LongRunningQueryManager] Query execution is too long [duration=3424ms, type=MAP, distributedJoin=false, enforceJoinOrder=false, lazy=false, schema=PUBLIC, sql='SELECT
"__Z0"."ID" "__C0_0",
"__Z0"."URL" "__C0_1",
"__Z0"."SCORE" "__C0_2",
"__Z0"."APPNAME_ID" "__C0_3"
FROM "PUBLIC"."URLS" "__Z0"
WHERE "__Z0"."APPNAME_ID" = ?1
ORDER BY 3 FETCH FIRST ?2 ROWS ONLY', plan=SELECT
    __Z0.ID AS __C0_0,
    __Z0.URL AS __C0_1,
    __Z0.SCORE AS __C0_2,
    __Z0.APPNAME_ID AS __C0_3
FROM PUBLIC.URLS __Z0
    /* PUBLIC.IDX_2_URLS */
    /* scanCount: 101020 */
WHERE __Z0.APPNAME_ID = ?1
ORDER BY 3
FETCH FIRST ?2 ROWS ONLY
/* index sorted */, node=TcpDiscoveryNode [id=2ab5710f-6568-4940-b3cc-ce756a634f4e, consistentId=2ab5710f-6568-4940-b3cc-ce756a634f4e, addrs=ArrayList [0:0:0:0:0:0:0:1%lo, 127.0.0.1, 172.20.46.195], sockAddrs=HashSet [/0:0:0:0:0:0:0:1%lo:0, /127.0.0.1:0, /172.20.46.195:0], discPort=0, order=3, intOrder=3, lastExchangeTime=1588604898769, loc=false, ver=8.7.10#20191227-sha1:c481441d, isClient=true], reqId=292181, segment=0]

We don't think the there is any issue with query because we ran a read/write speed test on big data and found the query time to be under 10ms. But here it's getting delayed with even less data in database.

After 12 hours of uptime, checkpointing is taking approximately 1 minute to finish.

There are a few more warnings that are being printed in logs often. Please find them belo

2 checkpoint pages were not written yet due to unsuccessful page write lock acquisition and will be retried
Throttling is applied to page modifications [percentOfPartTime=0.62, markDirty=2440 pages/sec, checkpointWrite=1971 pages/sec, estIdealMarkDirty=0 pages/sec, curDirty=0.00, maxDirty=0.02, avgParkTime=253807 ns, pages: (total=132474, evicted=0, written=831, synced=0, cpBufUsed=543, cpBufTotal=259107)]
[sys-stripe-38-#39][GridContinuousProcessor] Failed to wait for ack message. [node=2ab5710f-6568-4940-b3cc-ce756a634f4e, routine=b24f7959-546b-4242-81d4-c51de3ce0fc2]
Page replacements started, pages will be rotated with disk, this will affect storage performance (consider increasing DataRegionConfiguration#setMaxSize for data region)

Upvotes: 2

Views: 985

Answers (2)

alamar
alamar

Reputation: 19313

In addition to answers on user list, I think you are running out of checkpoint page buffer when doing intensive writes.

I recommend increasing checkpointPageBufferSize attribute if your data regions. It starts at 20% of your data region size if not specified, I think.

Upvotes: 1

dmagda
dmagda

Reputation: 1785

The answer is provided on the Ignite user list.

Upvotes: 1

Related Questions