g.pickardou

Reputation: 35873

How do Cassandra data centers relate to cluster(s) and ring(s)?

I have a Cassandra cluster with 8 nodes in 2 datacenters: 4 nodes each in DC1 and DC2.

I've created a keyspace:

CREATE KEYSPACE mykeyspace
  WITH REPLICATION = {
    'class' : 'NetworkTopologyStrategy',
    'DC1' : 2,
    'DC2' : 2
  };

As far as I understand, both DC1 and DC2 will hold all the data; in other words, if the whole of DC1 goes offline, DC2 will be capable of serving all the data.
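For what it's worth, I can check this on the running cluster with standard nodetool commands (the table name mytable and the key value below are hypothetical placeholders):

# With RF = {'DC1': 2, 'DC2': 2}, this should print four addresses:
# two nodes in DC1 and two nodes in DC2 for any given partition key.
nodetool getendpoints mykeyspace mytable some_key_value

# Per-DC view of the keyspace; each DC's "Owns (effective)" column
# should sum to roughly 100% on its own.
nodetool status mykeyspace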

Question

Should we say that, based on the previous fact, both DC1 and DC2 form a "complete" ring of their own? (Meaning the whole hash range -2^63 .. +2^63-1 is covered by the nodes in DC1, and the same is true for DC2.)

Why am I asking this?

My answer would be no: this is still one cluster, so one ring, regardless of the fact that there are two subsets of nodes which each contain all the data. However, many images and illustrations represent the nodes in the two datacenters as two "circles", which hints at the term two "rings" (obviously not two clusters).

see for example:

DataStax: Multiple datacenter write requests

PS: If possible, please do not bring consistency levels into the picture. I understand that the inter-node communication workflow depends on whether the operation is a write or a read, and also on the consistency level.

A practical question which depends on the answer:

Say that in DC1 num_tokens: 256 is set for all nodes and in DC2 num_tokens: 32 is set for all nodes. Those numbers are relative to each other if all 8 nodes form one token ring, but if DC1 and DC2 are two separate token rings, those numbers (256 and 32) have nothing to do with each other...
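For concreteness, the setting I mean is the per-node cassandra.yaml value, sketched below with the hypothetical numbers from above:

# cassandra.yaml on each DC1 node
num_tokens: 256

# cassandra.yaml on each DC2 node
num_tokens: 32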

Upvotes: 0

Views: 346

Answers (2)

Adriano Bonacin
Adriano Bonacin

Reputation: 106

Look, if you use SimpleStrategy it works as just one ring. If you use NetworkTopologyStrategy it looks like two or more rings. You can use nodetool ring to see tokens vs. nodes, and nodetool getendpoints keyspace table partition_key to see where your partition key will be located.

[root@ip-20-0-1-226 ~]# nodetool ring

Datacenter: dc1
==========
Address     Rack        Status State   Load            Owns                Token
                                                                           8037128101152694619
20.0.1.50   rack1       Up     Normal  456.32 MiB      41.66%              -9050061154907259251
20.0.1.50   rack1       Up     Normal  456.32 MiB      41.66%              -8877859671879922723
20.0.1.226  rack2       Up     Normal  608.99 MiB      58.34%              -8871087231721285506
20.0.1.50   rack1       Up     Normal  456.32 MiB      41.66%              -8594840449446657067
20.0.1.226  rack2       Up     Normal  608.99 MiB      58.34%              -2980375791196469732
20.0.1.226  rack2       Up     Normal  608.99 MiB      58.34%              -2899706862324328975
20.0.1.50   rack1       Up     Normal  456.32 MiB      41.66%              -2406342150306062345
20.0.1.226  rack2       Up     Normal  608.99 MiB      58.34%              -2029972788998320465
20.0.1.50   rack1       Up     Normal  456.32 MiB      41.66%              -1666526652028070649
20.0.1.226  rack2       Up     Normal  608.99 MiB      58.34%              1079561723841835665
20.0.1.226  rack2       Up     Normal  608.99 MiB      58.34%              1663305819374808009
20.0.1.50   rack1       Up     Normal  456.32 MiB      41.66%              4099186620247408174
20.0.1.50   rack1       Up     Normal  456.32 MiB      41.66%              5181974457141074579
20.0.1.226  rack2       Up     Normal  608.99 MiB      58.34%              6403842400328155928
20.0.1.226  rack2       Up     Normal  608.99 MiB      58.34%              6535209989509674611
20.0.1.50   rack1       Up     Normal  456.32 MiB      41.66%              8037128101152694619
[root@ip-20-0-1-44 ~]# nodetool ring

Datacenter: dc1
==========
Address     Rack        Status State   Load            Owns                Token
                                                                           8865515588426899552
20.0.1.44   rack1       Up     Normal  337.81 MiB      100.00%             -5830638745978850993
20.0.1.44   rack1       Up     Normal  337.81 MiB      100.00%             -4570936939416887314
20.0.1.44   rack1       Up     Normal  337.81 MiB      100.00%             -4234199013293852138
20.0.1.44   rack1       Up     Normal  337.81 MiB      100.00%             -3212848663801274832
20.0.1.44   rack1       Up     Normal  337.81 MiB      100.00%             -2683544040240894822
20.0.1.44   rack1       Up     Normal  337.81 MiB      100.00%             6070021776298348267
20.0.1.44   rack1       Up     Normal  337.81 MiB      100.00%             7319793018057117390
20.0.1.44   rack1       Up     Normal  337.81 MiB      100.00%             8865515588426899552

Datacenter: dc2
==========
Address     Rack        Status State   Load            Owns                Token
                                                                           7042359221330965349
20.0.1.150  rack1       Up     Normal  323.66 MiB      100.00%             -6507323776677663977
20.0.1.150  rack1       Up     Normal  323.66 MiB      100.00%             -2315356636250039239
20.0.1.150  rack1       Up     Normal  323.66 MiB      100.00%             -2097227748877766854
20.0.1.150  rack1       Up     Normal  323.66 MiB      100.00%             -630561501032529888
20.0.1.150  rack1       Up     Normal  323.66 MiB      100.00%             2580829093211157045
20.0.1.150  rack1       Up     Normal  323.66 MiB      100.00%             4687230732027490213
20.0.1.150  rack1       Up     Normal  323.66 MiB      100.00%             4817758060672762980
20.0.1.150  rack1       Up     Normal  323.66 MiB      100.00%             7042359221330965349
[root@ip-20-0-1-44 ~]# nodetool getendpoints qa eventsrawtest "host1","2019-03-29","service1"
20.0.1.150
20.0.1.44
CREATE KEYSPACE qa WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': '1', 'dc2': '1'} AND durable_writes = true;

CREATE TABLE eventsrawtest (
    host text,
    bucket_time text,
    service text,
    time timestamp,
    metric double,
    state text,
    PRIMARY KEY ((host, bucket_time, service), time)
) WITH CLUSTERING ORDER BY (time DESC);
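A minimal sketch of an insert that lands in the partition queried above (the metric and state values are made up):

INSERT INTO qa.eventsrawtest (host, bucket_time, service, time, metric, state)
VALUES ('host1', '2019-03-29', 'service1', toTimestamp(now()), 1.0, 'OK');

After that, nodetool getendpoints (as shown above) returns one replica in dc1 and one in dc2, matching the keyspace's replication settings.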

Upvotes: 1

Adriano Bonacin

Reputation: 106

The short answer is: both DCs will have 2 replicas, so 4 replicas of your data in total.

Cassandra is smart enough to understand your topology and distribute the data accordingly.

It's also important to distribute data between racks (rack awareness), since Cassandra will write one replica to each rack. Then you can be sure that your data is spread out, and you can lose up to 6 nodes without losing data (considering all your keyspaces use the replication factor mentioned above). A typical layout follows, with the snitch configuration sketched after it:

DC1
- rack1
-- 2 nodes
- rack2
-- 2 nodes

DC2
- rack1
-- 2 nodes
- rack2
-- 2 nodes
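Rack awareness comes from the snitch configuration; a minimal sketch, assuming GossipingPropertyFileSnitch (the dc and rack values are illustrative, shown for a DC1/rack1 node):

# cassandra.yaml (all nodes)
endpoint_snitch: GossipingPropertyFileSnitch

# cassandra-rackdc.properties (this node's own DC and rack)
dc=DC1
rack=rack1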

Finally, you can have distinct num_tokens values between DCs. This will not affect the replication factor. If you check the docs, a smaller value is recommended: https://cassandra.apache.org/doc/latest/cassandra/getting_started/production.html
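For reference, the linked production guide recommends something along these lines for new clusters (a sketch; the allocation factor should match your keyspace's per-DC replication factor, which is 2 here):

# cassandra.yaml (Cassandra 4.x, set before bootstrapping the node)
num_tokens: 16
allocate_tokens_for_local_replication_factor: 2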

Upvotes: 0
