Vinod Jayachandran

Reputation: 3898

Cassandra throws OutOfMemory

In our test environment, we have a single-node Cassandra cluster with RF=1 for all keyspaces.

The JVM arguments of interest are listed below:

-XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms2G -Xmx2G -Xmn1G -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=1000003 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8
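For context: in Cassandra 2.1 these sizes normally come from conf/cassandra-env.sh rather than being passed by hand. A sketch of the equivalent lines for our current values:

MAX_HEAP_SIZE="2G"
HEAP_NEWSIZE="1G"

(cassandra-env.sh derives -Xms/-Xmx from MAX_HEAP_SIZE and -Xmn from HEAP_NEWSIZE.)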

We noticed full GCs happening frequently, and Cassandra becoming unresponsive during GC:

INFO  [Service Thread] 2016-12-29 15:52:40,901 GCInspector.java:252 - ParNew GC in 238ms.  CMS Old Gen: 782576192 -> 802826248; Par Survivor Space: 60068168 -> 32163264

INFO  [Service Thread] 2016-12-29 15:52:40,902 GCInspector.java:252 - ConcurrentMarkSweep GC in 1448ms.  CMS Old Gen: 802826248 -> 393377248; Par Eden Space: 859045888 -> 0; Par Survivor Space: 32163264 -> 0

We are getting a java.lang.OutOfMemoryError with the exception below:

ERROR [SharedPool-Worker-5] 2017-01-26 09:23:13,694 JVMStabilityInspector.java:94 - JVM state determined to be unstable.  Exiting forcefully due to:
java.lang.OutOfMemoryError: Java heap space
        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57) ~[na:1.7.0_80]
        at java.nio.ByteBuffer.allocate(ByteBuffer.java:331) ~[na:1.7.0_80]
        at org.apache.cassandra.utils.memory.SlabAllocator.getRegion(SlabAllocator.java:137) ~[apache-cassandra-2.1.8.jar:2.1.8]
        at org.apache.cassandra.utils.memory.SlabAllocator.allocate(SlabAllocator.java:97) ~[apache-cassandra-2.1.8.jar:2.1.8]
        at org.apache.cassandra.utils.memory.ContextAllocator.allocate(ContextAllocator.java:57) ~[apache-cassandra-2.1.8.jar:2.1.8]
        at org.apache.cassandra.utils.memory.ContextAllocator.clone(ContextAllocator.java:47) ~[apache-cassandra-2.1.8.jar:2.1.8]
        at org.apache.cassandra.utils.memory.MemtableBufferAllocator.clone(MemtableBufferAllocator.java:61) ~[apache-cassandra-2.1.8.jar:2.1.8]
        at org.apache.cassandra.db.Memtable.put(Memtable.java:192) ~[apache-cassandra-2.1.8.jar:2.1.8]
        at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:1237) ~[apache-cassandra-2.1.8.jar:2.1.8]
        at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:400) ~[apache-cassandra-2.1.8.jar:2.1.8]
        at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:363) ~[apache-cassandra-2.1.8.jar:2.1.8]
        at org.apache.cassandra.db.Mutation.apply(Mutation.java:214) ~[apache-cassandra-2.1.8.jar:2.1.8]
        at org.apache.cassandra.service.StorageProxy$7.runMayThrow(StorageProxy.java:1033) ~[apache-cassandra-2.1.8.jar:2.1.8]
        at org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2224) ~[apache-cassandra-2.1.8.jar:2.1.8]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_80]
        at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) ~[apache-cassandra-2.1.8.jar:2.1.8]
        at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-2.1.8.jar:2.1.8]
        at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]

We were able to restore Cassandra after executing nodetool repair.

nodetool status

Datacenter: DC1

Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load       Tokens  Owns    Host ID                               Rack
UN  10.3.211.3  5.74 GB    256     ?       32251391-5eee-4891-996d-30fb225116a1  RAC1

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless

nodetool info

ID                     : 32251391-5eee-4891-996d-30fb225116a1
Gossip active          : true
Thrift active          : true
Native Transport active: true
Load                   : 5.74 GB
Generation No          : 1485526088
Uptime (seconds)       : 330651
Heap Memory (MB)       : 812.72 / 1945.63
Off Heap Memory (MB)   : 7.63
Data Center            : DC1
Rack                   : RAC1
Exceptions             : 0
Key Cache              : entries 68, size 6.61 KB, capacity 97 MB, 1158 hits, 1276 requests, 0.908 recent hit rate, 14400 save period in seconds
Row Cache              : entries 0, size 0 bytes, capacity 0 bytes, 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds
Counter Cache          : entries 0, size 0 bytes, capacity 48 MB, 0 hits, 0 requests, NaN recent hit rate, 7200 save period in seconds
Token                  : (invoke with -T/--tokens to see all 256 tokens)

From system.log, I see lots of "Compacting large partition" warnings:

WARN  [CompactionExecutor:33463] 2016-12-24 05:42:29,550 SSTableWriter.java:240 - Compacting large partition mydb/Table_Name:2016-12-23 00:00+0530 (142735455 bytes)
WARN  [CompactionExecutor:33465] 2016-12-24 05:47:57,343 SSTableWriter.java:240 - Compacting large partition mydb/Table_Name_2:22:0c2e6c00-a5a3-11e6-a05e-1f69f32db21c (162203393 bytes)
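Partition sizes can also be checked outside of compaction with nodetool; a sketch using the table name from the warning above:

nodetool cfstats mydb.Table_Name        # reports "Compacted partition maximum bytes"
nodetool cfhistograms mydb Table_Name   # shows the partition size distribution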

For tombstone settings, I see the following node configuration in system.log:

[main] 2016-12-28 18:23:06,534 YamlConfigurationLoader.java:135 - Node configuration:[authenticator=PasswordAuthenticator; authorizer=CassandraAuthorizer; auto_snapshot=true; batch_size_warn_threshold_in_kb=5; batchlog_replay_throttle_in_kb=1024; cas_contention_timeout_in_ms=1000; client_encryption_options=; cluster_name=bankbazaar; column_index_size_in_kb=64; commit_failure_policy=ignore; commitlog_directory=/var/cassandra/log/commitlog; commitlog_segment_size_in_mb=32; commitlog_sync=periodic; commitlog_sync_period_in_ms=10000; compaction_throughput_mb_per_sec=16; concurrent_counter_writes=32; concurrent_reads=32; concurrent_writes=32; counter_cache_save_period=7200; counter_cache_size_in_mb=null; counter_write_request_timeout_in_ms=15000; cross_node_timeout=false; data_file_directories=[/cryptfs/sdb/cassandra/data, /cryptfs/sdc/cassandra/data, /cryptfs/sdd/cassandra/data]; disk_failure_policy=best_effort; dynamic_snitch_badness_threshold=0.1; dynamic_snitch_reset_interval_in_ms=600000; dynamic_snitch_update_interval_in_ms=100; endpoint_snitch=GossipingPropertyFileSnitch; hinted_handoff_enabled=true; hinted_handoff_throttle_in_kb=1024; incremental_backups=false; index_summary_capacity_in_mb=null; index_summary_resize_interval_in_minutes=60; inter_dc_tcp_nodelay=false; internode_compression=all; key_cache_save_period=14400; key_cache_size_in_mb=null; listen_address=127.0.0.1; max_hint_window_in_ms=10800000; max_hints_delivery_threads=2; memtable_allocation_type=heap_buffers; native_transport_port=9042; num_tokens=256; partitioner=org.apache.cassandra.dht.Murmur3Partitioner; permissions_validity_in_ms=2000; range_request_timeout_in_ms=20000; read_request_timeout_in_ms=10000; request_scheduler=org.apache.cassandra.scheduler.NoScheduler; request_timeout_in_ms=20000; row_cache_save_period=0; row_cache_size_in_mb=0; rpc_address=127.0.0.1; rpc_keepalive=true; rpc_port=9160; rpc_server_type=sync; saved_caches_directory=/var/cassandra/data/saved_caches; seed_provider=[{class_name=org.apache.cassandra.locator.SimpleSeedProvider, parameters=[{seeds=127.0.0.1}]}]; server_encryption_options=; snapshot_before_compaction=false; ssl_storage_port=9001; sstable_preemptive_open_interval_in_mb=50; start_native_transport=true; start_rpc=true; storage_port=9000; thrift_framed_transport_size_in_mb=15; tombstone_failure_threshold=100000; tombstone_warn_threshold=1000; trickle_fsync=false; trickle_fsync_interval_in_kb=10240; truncate_request_timeout_in_ms=60000; write_request_timeout_in_ms=5000]
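The relevant values in that dump are tombstone_warn_threshold=1000 and tombstone_failure_threshold=100000. Per-query tombstone counts can also be checked with CQL tracing; a sketch with a placeholder table and key:

cqlsh> TRACING ON;
cqlsh> SELECT * FROM mydb.table_name WHERE id = 42;

The resulting trace reports live vs. tombstoned cell counts for the read.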

nodetool tpstats

Pool Name                    Active   Pending      Completed   Blocked  All time blocked
CounterMutationStage              0         0              0         0                 0
ReadStage                        32      4061       50469243         0                 0
RequestResponseStage              0         0              0         0                 0
MutationStage                    32        22       27665114         0                 0
ReadRepairStage                   0         0              0         0                 0
GossipStage                       0         0              0         0                 0
CacheCleanupExecutor              0         0              0         0                 0
AntiEntropyStage                  0         0              0         0                 0
MigrationStage                    0         0              0         0                 0
Sampler                           0         0              0         0                 0
ValidationExecutor                0         0              0         0                 0
CommitLogArchiver                 0         0              0         0                 0
MiscStage                         0         0              0         0                 0
MemtableFlushWriter               0         0           7769         0                 0
MemtableReclaimMemory             1        57          13433         0                 0
PendingRangeCalculator            0         0              1         0                 0
MemtablePostFlush                 0         0           9279         0                 0
CompactionExecutor                3        47         169022         0                 0
InternalResponseStage             0         0              0         0                 0
HintedHandoff                     0         1            148         0                 0

Is there any YAML or other configuration we can use to avoid these large partition compactions?

What is the correct compaction strategy to use? Can an OutOfMemoryError happen because of a wrong compaction strategy?

In one of the keyspaces, each row is written once and read multiple times.

Another keyspace holds time-series data: insert-only, with multiple reads.
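If it comes down to changing the compaction strategy, would something like this be reasonable? The table names below are placeholders: LeveledCompactionStrategy for the write-once/read-many keyspace, and DateTieredCompactionStrategy (available in 2.1) for the time-series one.

ALTER TABLE mydb.read_mostly_table
  WITH compaction = {'class': 'LeveledCompactionStrategy'};

ALTER TABLE mydb.timeseries_table
  WITH compaction = {'class': 'DateTieredCompactionStrategy'};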

Upvotes: 0

Views: 2892

Answers (1)

MarcintheCloud

Reputation: 1653

Seeing this: Heap Memory (MB): 812.72 / 1945.63 tells me that your one machine is probably underpowered. There's a good chance it's not able to keep up with GC.

While in this case I think the problem is mostly undersizing, access patterns, data model, and payload size can also affect GC, so if you'd like to update your post with that information, I can update my answer to reflect it.

EDIT to reflect new information

Thanks for adding the additional information. Based on what you posted, there are two immediate things I notice that can cause your heap to blow up:

Large partitions:

It looks like compaction had to process two partitions that exceeded 100 MB (roughly 140 MB and 160 MB respectively). Normally that would still be OK (not great), but because you're running on underpowered hardware with such a small heap, that's quite a lot.

The thing about compaction

It uses a healthy mix of resources when it runs. That's business as usual, so it's something you should test and plan for. In this case, I'm certain that compaction is working harder because of the large partitions, and it's consuming CPU (which GC also needs), heap, and IO.
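You can watch that load directly while it happens, for example:

nodetool compactionstats      # in-flight compactions and bytes remaining
nodetool compactionhistory    # recently completed compactions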

This brings me to another concern:

Pool Name                    Active   Pending      Completed   Blocked  All time blocked
CounterMutationStage              0         0              0         0                 0
ReadStage                        32      4061       50469243         0                 0

This is usually a sign that you need to scale up and/or scale out. In your case, you might want to do both. You can exhaust a single, underpowered node pretty quickly with an unoptimized data model. You also don't get to experience the nuances of a distributed system when testing in a single-node environment.

So the TL;DR:

For a read-heavy workload (which this seems to be), you'll need a larger heap. For overall sanity and cluster health, you'll also need to revisit your data model to make sure the partitioning logic is sound. If you're not sure how or why to do either, I suggest spending some time here: https://academy.datastax.com/courses
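For the time-series keyspace in particular, the usual fix for oversized partitions is adding a time bucket to the partition key so no single partition grows without bound. A sketch with made-up names:

CREATE TABLE mydb.events (
    source_id text,
    day       text,        -- e.g. '2016-12-23'; bounds each partition to one day
    ts        timestamp,
    payload   blob,
    PRIMARY KEY ((source_id, day), ts)
);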

Upvotes: 1
