Cassandra StatefulSet in Kubernetes

I've been trying to set up a redundant StatefulSet in Kubernetes with the Google Cassandra image, as described in the Kubernetes 1.7 documentation.

According to the image used, it's a StatefulSet with a consistency level of ONE. In my test I'm using SimpleStrategy replication with a replication factor of 3, since I have set up 3 replicas in the StatefulSet, in a single datacenter. I've defined cassandra-0, cassandra-1 and cassandra-2 as seeds, so all nodes are seeds.
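
For reference, the replica count and the seed configuration can be checked with something like the following (the StatefulSet name cassandra and the CASSANDRA_SEEDS environment variable are assumptions taken from the tutorial manifests and sample image, so adjust if yours differ):

# Confirm the StatefulSet is scaled to 3 replicas
kubectl get statefulset cassandra

# The sample image picks its seed list up from an environment variable
# (CASSANDRA_SEEDS in the tutorial manifests); check what each pod was given
kubectl exec cassandra-0 -- env | grep -i seed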

I've created a keyspace and a table:

"create keyspace if not exists testing with replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 }"

"create table testing.test (id uuid primary key, name text, age int, properties map<text,text>, nickames set<text>, goals_year map<int,int>, current_wages float, clubs_season tuple<text,int>);"

I'm testing by inserting data from another, unrelated pod using the cqlsh binary, and I can see that the data ends up in every container, so replication is successful. nodetool status on all pods comes up with:

Datacenter: DC1-K8Demo
======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.16.0.161  71.04 KiB  32           100.0%            4ad4e1d3-f984-4f0c-a349-2008a40b7f0a  Rack1-K8Demo
UN  10.16.0.162  71.05 KiB  32           100.0%            fffca143-7ee8-4749-925d-7619f5ca0e79  Rack1-K8Demo
UN  10.16.2.24   71.03 KiB  32           100.0%            975a5394-45e4-4234-9a97-89c3b39baf3d  Rack1-K8Demo

...and all Cassandra pods have the same data in the table created earlier:

 id                                   | age | clubs_season | current_wages | goals_year | name     | nickames | properties
--------------------------------------+-----+--------------+---------------+------------+----------+----------+--------------------------------------------------
 b6d6f230-c0f5-11e7-98e0-e9450c2870ca |  26 |         null |          null |       null | jonathan |     null | {'goodlooking': 'yes', 'thinkshesthebest': 'no'}
 5fd02b70-c0f8-11e7-8e29-3f611e0d5e94 |  26 |         null |          null |       null | jonathan |     null | {'goodlooking': 'yes', 'thinkshesthebest': 'no'}
 5da86970-c0f8-11e7-8e29-3f611e0d5e94 |  26 |         null |          null |       null | jonathan |     null | {'goodlooking': 'yes', 'thinkshesthebest': 'no'}

But then I delete one of those DB replica pods (cassandra-0); a new pod springs up again as expected, a new cassandra-0 (thanks Kubernetes!), and I now see that all the pods have lost one of those 3 rows (the deletion step itself is sketched after the nodetool output below):

 id                                   | age | clubs_season | current_wages | goals_year | name     | nickames | properties
--------------------------------------+-----+--------------+---------------+------------+----------+----------+--------------------------------------------------
 5fd02b70-c0f8-11e7-8e29-3f611e0d5e94 |  26 |         null |          null |       null | jonathan |     null | {'goodlooking': 'yes', 'thinkshesthebest': 'no'}
 5da86970-c0f8-11e7-8e29-3f611e0d5e94 |  26 |         null |          null |       null | jonathan |     null | {'goodlooking': 'yes', 'thinkshesthebest': 'no'}

...and nodetool status now comes up with:

Datacenter: DC1-K8Demo
======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.16.0.161  71.04 KiB  32           81.7%             4ad4e1d3-f984-4f0c-a349-2008a40b7f0a  Rack1-K8Demo
UN  10.16.0.162  71.05 KiB  32           78.4%             fffca143-7ee8-4749-925d-7619f5ca0e79  Rack1-K8Demo
DN  10.16.2.24   71.03 KiB  32           70.0%             975a5394-45e4-4234-9a97-89c3b39baf3d  Rack1-K8Demo
UN  10.16.2.28   85.49 KiB  32           69.9%             3fbed771-b539-4a44-99ec-d27c3d590f18  Rack1-K8Demo

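For completeness, the deletion referenced above is just a plain pod delete, roughly as follows (the app=cassandra label is an assumption based on the tutorial manifest):

# Delete one replica; the StatefulSet controller recreates cassandra-0,
# but the replacement comes back with a new IP (10.16.2.28 above)
kubectl delete pod cassandra-0

# Watch the replacement pod come up and note its pod IP
kubectl get pods -l app=cassandra -o wide -w
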
...shouldn't the Cassandra ring replicate all the data into the newly created pod, so that the 3 rows are still there in all Cassandra pods?

...this exercise is documented on GitHub.

...has anyone tried this exercise? What might be wrong in this test setup?

super thanks in advance

Upvotes: 0

Views: 857

Answers (1)

Horia

Reputation: 2982

I think that after bringing down the node, you need to inform the other peers in the cluster that the node is dead and needs replacing.
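
For example, something along these lines from one of the live nodes (a sketch: the host ID is the DN entry in the nodetool output above, and the follow-up repair of the keyspace is a suggested extra step):

# Tell the surviving peers that the old cassandra-0 (now shown as DN) is gone for good
nodetool removenode 975a5394-45e4-4234-9a97-89c3b39baf3d

# Then make sure the replacement node actually owns a full copy of the data
nodetool repair testing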

I would recommend some reading in order to have a correct test case.

Upvotes: 2
