Reputation: 728
I started using Cassandra a few days ago, and here is what I am trying to do.
I have about 2 million+ objects which maintain profiles of users. I convert these objects to JSON, compress them, and store them in a blob column. The average compressed JSON size is about 10 KB. This is how my table looks in Cassandra:
Table:
dev.userprofile (uid varchar primary key, profile blob);
Select Query: select profile from dev.userprofile where uid='';
Update Query:
update dev.userprofile set profile='<bytebuffer>' where uid = '<uid>'
Every hour, I get events from a queue which I apply to my userprofile objects. Each event corresponds to one userprofile object. I get about 1 million such events, so I have to update around 1 million userprofile objects within a short time, i.e., update the object in my application, compress the JSON, and update the Cassandra blob. I would prefer to finish updating all 1 million user profile objects in a few minutes, but I notice it's taking longer now.
While running my application, I notice that I can update around 400 profiles/second on average. I already see a lot of CPU iowait, 70%+ on the Cassandra instance. Also, the load is initially pretty high, around 16 (on an 8 vCPU instance), and then drops off to around 4.
What am I doing wrong? When I was updating smaller objects of about 2 KB, I noticed that Cassandra operations/sec were much faster; I was able to get about 3000 ops/sec. Any thoughts on how I should improve the performance?
<dependency>
    <groupId>com.datastax.cassandra</groupId>
    <artifactId>cassandra-driver-core</artifactId>
    <version>3.1.0</version>
</dependency>
<dependency>
    <groupId>com.datastax.cassandra</groupId>
    <artifactId>cassandra-driver-extras</artifactId>
    <version>3.1.0</version>
</dependency>
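Roughly, this is the write path per profile (a simplified sketch of what the application does, assuming GZIP compression; the real code has more to it):

import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class ProfileWriter {
    private final Session session;
    private final PreparedStatement update;

    public ProfileWriter(Session session) {
        this.session = session;
        this.update = session.prepare(
                "UPDATE dev.userprofile SET profile = ? WHERE uid = ?");
    }

    // Serialize -> GZIP-compress -> bind as ByteBuffer -> execute the update.
    public void write(String uid, String profileJson) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bos)) {
            gzip.write(profileJson.getBytes(StandardCharsets.UTF_8));
        }
        ByteBuffer blob = ByteBuffer.wrap(bos.toByteArray());
        session.execute(update.bind(blob, uid));
    }
}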
I have just one Cassandra node set up on an m4.2xlarge AWS instance for testing:
Single node Cassandra instance
m4.2xlarge aws ec2
500 GB General Purpose (SSD)
IOPS - 1500 / 10000
nodetool cfstats output
Keyspace: dev
Read Count: 688795
Read Latency: 27.280683695439137 ms.
Write Count: 688780
Write Latency: 0.010008401811899301 ms.
Pending Flushes: 0
Table: userprofile
SSTable count: 9
Space used (live): 32.16 GB
Space used (total): 32.16 GB
Space used by snapshots (total): 0 bytes
Off heap memory used (total): 13.56 MB
SSTable Compression Ratio: 0.9984539538554672
Number of keys (estimate): 2215817
Memtable cell count: 38686
Memtable data size: 105.72 MB
Memtable off heap memory used: 0 bytes
Memtable switch count: 6
Local read count: 688807
Local read latency: 29.879 ms
Local write count: 688790
Local write latency: 0.012 ms
Pending flushes: 0
Bloom filter false positives: 47
Bloom filter false ratio: 0.00003
Bloom filter space used: 7.5 MB
Bloom filter off heap memory used: 7.5 MB
Index summary off heap memory used: 2.07 MB
Compression metadata off heap memory used: 3.99 MB
Compacted partition minimum bytes: 216 bytes
Compacted partition maximum bytes: 370.14 KB
Compacted partition mean bytes: 5.82 KB
Average live cells per slice (last five minutes): 1.0
Maximum live cells per slice (last five minutes): 1
Average tombstones per slice (last five minutes): 1.0
Maximum tombstones per slice (last five minutes): 1
nodetool cfhistograms output
Percentile SSTables Write Latency Read Latency Partition Size Cell Count
(micros) (micros) (bytes)
50% 3.00 9.89 2816.16 4768 2
75% 3.00 11.86 43388.63 8239 2
95% 4.00 14.24 129557.75 14237 2
98% 4.00 20.50 155469.30 17084 2
99% 4.00 29.52 186563.16 20501 2
Min 0.00 1.92 61.22 216 2
Max 5.00 74975.55 4139110.98 379022 2
Dstat output
---load-avg--- --io/total- ---procs--- ------memory-usage----- ---paging-- -dsk/total- ---system-- ----total-cpu-usage---- -net/total-
1m 5m 15m | read writ|run blk new| used buff cach free| in out | read writ| int csw |usr sys idl wai hiq siq| recv send
12.8 13.9 10.6|1460 31.1 |1.0 14 0.2|9.98G 892k 21.2G 234M| 0 0 | 119M 3291k| 63k 68k| 1 1 26 72 0 0|3366k 3338k
13.2 14.0 10.7|1458 28.4 |1.1 13 1.5|9.97G 884k 21.2G 226M| 0 0 | 119M 3278k| 61k 68k| 2 1 28 69 0 0|3396k 3349k
12.7 13.8 10.7|1477 27.6 |0.9 11 1.1|9.97G 884k 21.2G 237M| 0 0 | 119M 3321k| 69k 72k| 2 1 31 65 0 0|3653k 3605k
12.0 13.7 10.7|1474 27.4 |1.1 8.7 0.3|9.96G 888k 21.2G 236M| 0 0 | 119M 3287k| 71k 75k| 2 1 36 61 0 0|3807k 3768k
11.8 13.6 10.7|1492 53.7 |1.6 12 1.2|9.95G 884k 21.2G 228M| 0 0 | 119M 6574k| 73k 75k| 2 2 32 65 0 0|3888k 3829k
Edit
I switched to LeveledCompactionStrategy and disabled compression on SSTables, but I don't see a big improvement.
There was a bit of improvement in profiles/sec updated; it's now 550-600 profiles/sec. But the CPU spikes remain, i.e., the iowait.
gcstats
Interval (ms)   Max GC Elapsed (ms)   Total GC Elapsed (ms)   Stdev GC Elapsed (ms)   GC Reclaimed (MB)   Collections   Direct Memory Bytes
755960 83 3449 8 73179796264 107 -1
dstats
---load-avg--- --io/total- ---procs--- ------memory-usage----- ---paging-- -dsk/total- ---system-- ----total-cpu-usage---- -net/total-
1m 5m 15m | read writ|run blk new| used buff cach free| in out | read writ| int csw |usr sys idl wai hiq siq| recv send
7.02 8.34 7.33| 220 16.6 |0.0 0 1.1|10.0G 756k 21.2G 246M| 0 0 | 13M 1862k| 11k 13k| 1 0 94 5 0 0| 0 0
6.18 8.12 7.27|2674 29.7 |1.2 1.5 1.9|10.0G 760k 21.2G 210M| 0 0 | 119M 3275k| 69k 70k| 3 2 83 12 0 0|3906k 3894k
5.89 8.00 7.24|2455 314 |0.6 5.7 0|10.0G 760k 21.2G 225M| 0 0 | 111M 39M| 68k 69k| 3 2 51 44 0 0|3555k 3528k
5.21 7.78 7.18|2864 27.2 |2.6 3.2 1.4|10.0G 756k 21.2G 266M| 0 0 | 127M 3284k| 80k 76k| 3 2 57 38 0 0|4247k 4224k
4.80 7.61 7.13|2485 288 |0.1 12 1.4|10.0G 756k 21.2G 235M| 0 0 | 113M 36M| 73k 73k| 2 2 36 59 0 0|3664k 3646k
5.00 7.55 7.12|2576 30.5 |1.0 4.6 0|10.0G 760k 21.2G 239M| 0 0 | 125M 3297k| 71k 70k| 2 1 53 43 0 0|3884k 3849k
5.64 7.64 7.15|1873 174 |0.9 13 1.6|10.0G 752k 21.2G 237M| 0 0 | 119M 21M| 62k 66k| 3 1 27 69 0 0|3107k 3081k
You can see the CPU spikes.
My main concern is the iowait before I increase the load further. Is there anything specific I should be looking for that's causing this? Because 600 profiles/sec (i.e., 600 reads + writes) seems low to me.
Upvotes: 0
Views: 1735
Reputation: 171
I had the same problem. After hundreds of tries, I solved it: it's the readahead setting, which may be set too high:
sudo blockdev --report
My system's default was 4 MB; I turned it down to 64 KB and it works! The average read latency improved from more than 100 ms to less than 2 ms.
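You can change it with blockdev --setra; the value is in 512-byte sectors, so 128 is roughly 64 KB. The device name below is just an example, use the one shown by the report, and note the setting does not persist across reboots:

sudo blockdev --setra 128 /dev/xvda
sudo blockdev --getra /dev/xvda   # verify the new value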
Upvotes: 1
Reputation: 16400
Can you try LeveledCompactionStrategy? With 1:1 reads/writes on large objects like this, the I/O saved on reads will probably counter the I/O spent on the more expensive compactions.
If you're already compressing the data before sending it, you should turn off compression on the table. It's breaking the blob into 64 KB chunks which will be largely dominated by only 6 values that won't get much compression (as shown in the horrible compression ratio: SSTable Compression Ratio: 0.9984539538554672).
ALTER TABLE dev.userprofile
WITH compaction = { 'class' : 'LeveledCompactionStrategy' }
AND compression = { 'sstable_compression' : '' };
400 profiles/second is very, very slow though, and there may be some work to do on your client, which could potentially be the bottleneck as well. If you have a load of 4 on an 8-core system, it may not be Cassandra slowing things down. Make sure you're parallelizing your requests and issuing them asynchronously; sending requests sequentially is a common issue.
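For instance, something along these lines with the 3.1 driver, using executeAsync with a semaphore to cap in-flight requests (just a sketch; MAX_IN_FLIGHT is an arbitrary number you would tune for your client and cluster):

import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Session;
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.concurrent.Semaphore;

public class AsyncProfileUpdater {
    // Assumed cap on concurrent requests; tune for your client and cluster.
    private static final int MAX_IN_FLIGHT = 128;

    public static void updateAll(Session session, Map<String, ByteBuffer> compressedProfiles)
            throws InterruptedException {
        PreparedStatement ps = session.prepare(
                "UPDATE dev.userprofile SET profile = ? WHERE uid = ?");
        Semaphore inFlight = new Semaphore(MAX_IN_FLIGHT);

        for (Map.Entry<String, ByteBuffer> entry : compressedProfiles.entrySet()) {
            inFlight.acquire(); // back-pressure: block once MAX_IN_FLIGHT requests are pending
            ResultSetFuture future = session.executeAsync(ps.bind(entry.getValue(), entry.getKey()));
            // release the permit when the request completes (success or failure)
            future.addListener(inFlight::release, Runnable::run);
        }
        inFlight.acquire(MAX_IN_FLIGHT); // wait for the remaining requests to finish
    }
}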
With larger blobs there is going to be an impact on GCs, so monitoring them and adding that information can be helpful. I would be surprised if 10 KB objects affected it that much, but it's something to look out for and may require more JVM tuning.
If that helps, from there I would recommend tuning the heap and upgrading to at least 3.7 or the latest in the 3.0 line.
Upvotes: 1