Reputation: 5890
I have a Cassandra 1.2 cluster that uses virtual nodes and the ByteOrderedPartitioner. I know this is not recommended, because I have to make sure the keys of the data are evenly distributed across the key space so that the load on each physical node is balanced. The problem I'm having is that I can't find a way to see the actual load on each virtual node. If I run nodetool like this:
nodetool status
I receive an output like this one:
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN XXX.XXX.XXX.XXX 14.73 GB 256 11.3% a4d365ca-f21b-4418-ab0e-656520d931b5 rack1
UN XXX.XXX.XXX.XXX 8.51 GB 256 10.6% f587fe0b-e765-4c02-bd50-cef9758e9a6b rack1
UN XXX.XXX.XXX.XXX 10.92 GB 256 10.3% 6160ca91-1e07-47ec-8fa9-ef886c140e91 rack1
UN XXX.XXX.XXX.XXX 9.62 GB 256 10.0% 9c4a8476-1de2-455b-956a-c4cea31675bf rack1
UN XXX.XXX.XXX.XXX 11.11 GB 256 11.2% 61639d9c-ad49-4f38-86b3-cd48e0c90c49 rack1
UN XXX.XXX.XXX.XXX 7.86 GB 256 35.1% 195b6f79-7d68-4a98-8a9b-55bd0dd699e2 rack1
UN XXX.XXX.XXX.XXX 11.29 GB 256 11.4% 0ac03b6a-0a0e-4f83-8b9e-2f16d4db47ab rack1
This means the distribution is not that good, but I want to see the actual distribution across the virtual nodes. The problem is that running:
nodetool ring
gives me a lot of entries, one per virtual node (256 in total) on the node where I run the command, but the information is pretty much useless: the load looks the same for every virtual node, and the size shown cannot be real for a single virtual node given the total amount of data on the physical node.
XXX.XXX.XXX.XXX rack1 Up Normal 11.29 GB 11.45% Token(bytes[2daad5a3e325e152d7be5bc2d5f87fef])
XXX.XXX.XXX.XXX rack1 Up Normal 11.29 GB 11.45% Token(bytes[2ffef9060e59c1c922a1ecf8e2643794])
XXX.XXX.XXX.XXX rack1 Up Normal 11.29 GB 11.45% Token(bytes[31041cc591d63d91a67a21ecf44a57c2])
XXX.XXX.XXX.XXX rack1 Up Normal 11.29 GB 11.45% Token(bytes[31bbcaafcdcb2ecc3a4ef3fb3af4b82b])
XXX.XXX.XXX.XXX rack1 Up Normal 11.29 GB 11.45% Token(bytes[324e972b43b63d63df4255e459fed524])
XXX.XXX.XXX.XXX rack1 Up Normal 11.29 GB 11.45% Token(bytes[3353224ae20e902e5b2b243c8fc5ff97])
XXX.XXX.XXX.XXX rack1 Up Normal 11.29 GB 11.45% Token(bytes[350ed29fa9a1a377b8014beef1d160f0])
XXX.XXX.XXX.XXX rack1 Up Normal 11.29 GB 11.45% Token(bytes[3553ad83beaf91d98a692e22718e321d])
XXX.XXX.XXX.XXX rack1 Up Normal 11.29 GB 11.45% Token(bytes[35893a82c84982c467251115a7406f00])
XXX.XXX.XXX.XXX rack1 Up Normal 11.29 GB 11.45% Token(bytes[37fad1c7dbd8d66d75747699ce4d6d2e])
XXX.XXX.XXX.XXX rack1 Up Normal 11.29 GB 11.45% Token(bytes[388bcf470bd5c97e1f3cb45c01bd1f2c])
XXX.XXX.XXX.XXX rack1 Up Normal 11.29 GB 11.45% Token(bytes[38a0cdc654a9934e5a16e5242c26fc5f])
XXX.XXX.XXX.XXX rack1 Up Normal 11.29 GB 11.45% Token(bytes[393b8185b527f036cd44f5f6791484b9])
XXX.XXX.XXX.XXX rack1 Up Normal 11.29 GB 11.45% Token(bytes[39ae4356a22bbb5ea20d5c6fc83cd2de])
XXX.XXX.XXX.XXX rack1 Up Normal 11.29 GB 11.45% Token(bytes[39dd01bb66beeeb46627f0303671c30d])
XXX.XXX.XXX.XXX rack1 Up Normal 11.29 GB 11.45% Token(bytes[3a49f707a7cea045935524900094c4e4])
XXX.XXX.XXX.XXX rack1 Up Normal 11.29 GB 11.45% Token(bytes[3a58eba6a5730a75fd899cf77c93d6cb])
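To make clearer what I'm after, here is a minimal Python sketch of the only per-virtual-node figure I can derive from this output: the share of the token space covered by each range. It assumes every token is a fixed 16-byte value that can be compared as a big-endian integer (true for the tokens shown above, but not guaranteed by ByteOrderedPartitioner in general), and the function name `token_range_shares` is my own. It measures key-space width only, not the amount of data stored in each range.

```python
import re

# Matches tokens printed as Token(bytes[<hex>]) in the `nodetool ring` output.
TOKEN_RE = re.compile(r"Token\(bytes\[([0-9a-fA-F]+)\]\)")


def token_range_shares(ring_output, token_bytes=16):
    """Return (token_hex, share) pairs sorted by token.

    share is the fraction of the token space between the previous token
    and this one. Assumption: tokens are fixed-width values of
    `token_bytes` bytes, interpreted as big-endian integers.
    """
    space = 1 << (8 * token_bytes)
    tokens = sorted({int(h, 16) for h in TOKEN_RE.findall(ring_output)})
    shares = []
    for i, tok in enumerate(tokens):
        prev = tokens[i - 1]  # for i == 0 this wraps around to the last token
        # A width of 0 only happens with a single token, which owns the whole ring.
        width = (tok - prev) % space or space
        shares.append((format(tok, "0{}x".format(2 * token_bytes)), width / space))
    return shares
```

Feeding it the full text printed by `nodetool ring` yields one share per virtual node, which shows how uneven the ranges are but still says nothing about how many rows fall into each of them.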
My question is, is there another tool/way of getting the real load of each virtual node in a Cassandra cluster?
Thanks in advance!
Upvotes: 4
Views: 338
Reputation: 492
When you run nodetool ring without a keyspace, it examines the load based on the SimpleStrategy for replication. If you have your tokens properly distributed for NetworkTopologyStrategy, this will look "off".
Since replication strategy determines load, and each keyspace can have a different replication strategy, you need to pass in the keyspace name as the second arg to see the true load distribution per keyspace.
If you are using the NetworkTopologyStrategy, nodetool ring <keyspace> will take into account datacenter and rack location to determine your token distribution, and give you an accurate load value.
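Because the ring output prints one line per virtual node, it can help to condense it per address before comparing keyspaces. The following is a rough Python sketch, not an official tool: it assumes the column layout shown in the question (address, rack, status, state, load, owns, token), and the name `summarize_ring` is hypothetical.

```python
import re
from collections import OrderedDict

# One ring line, e.g.:
# 10.0.0.1 rack1 Up Normal 11.29 GB 11.45% Token(bytes[2daad5...])
# The column order is an assumption taken from the output in the question.
LINE_RE = re.compile(
    r"^(?P<address>\S+)\s+(?P<rack>\S+)\s+(?P<status>\S+)\s+(?P<state>\S+)\s+"
    r"(?P<load>[\d.]+ \w+)\s+(?P<owns>[\d.]+)%\s+(?P<token>\S+)\s*$"
)


def summarize_ring(ring_output):
    """Collapse per-token ring lines into one entry per address.

    Keeps the load and ownership reported on the first line seen for each
    address and counts how many token lines that address has. Lines that
    do not match the expected layout (headers, blanks) are skipped.
    """
    summary = OrderedDict()
    for line in ring_output.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        entry = summary.setdefault(
            match.group("address"),
            {"load": match.group("load"), "owns": float(match.group("owns")), "tokens": 0},
        )
        entry["tokens"] += 1
    return summary
```

Save the output of nodetool ring <keyspace> to a file or capture it in a string, pass the text to `summarize_ring`, and compare the per-address ownership between keyspaces. The load and ownership are taken once per address rather than summed, since the sample output repeats the same node-level values on every token line.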
Upvotes: 2
Reputation: 126
Have you tried DataStax OpsCenter? http://www.datastax.com/what-we-offer/products-services/datastax-opscenter
I'm not sure (I've never tried) whether you can get the real load of each virtual node specifically, but it's a great tool for monitoring and managing your database.
Upvotes: 0