Julien Fouilhé

Reputation: 2658

Cassandra: Data loss after adding new node

We had a two-node Cassandra cluster that we wanted to expand to four. We followed the procedure described here: http://www.datastax.com/documentation/cassandra/1.2/cassandra/operations/ops_add_node_to_cluster_t.html

But after adding the two new nodes (one after the other, with the two-minute interval recommended in the documentation), we experienced some data loss: some column families were missing elements.
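For reference, here is a rough sketch of the flow that procedure describes (assuming a package install managed with service; adjust to your environment):

# on each new node, after configuring cassandra.yaml (seeds, listen_address, ...):
sudo service cassandra start
# wait two minutes before starting the next new node, then verify:
nodetool status
# once all new nodes show UN (Up/Normal), on each of the ORIGINAL nodes:
nodetool cleanup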

Here is the nodetool status output:

[centos@ip-10-11-11-187 ~]$ nodetool status
Datacenter: us-west-2
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens  Owns    Host ID                               Rack
UN  10.11.11.187  5.63 MB    256     ?       0e596912-d649-4eed-82a4-df800c422634  2c
UN  10.11.1.104   748.79 MB  256     ?       d8b96739-0858-4926-9eb2-27c96ca0a1c4  2c
UN  10.11.11.24   7.11 MB    256     ?       e3e76dcf-2c39-42e5-a34e-9e986d4a9f7c  2c
UN  10.11.1.231   878.91 MB  256     ?       cc1b5cfd-c9d0-4ca9-bbb1-bce4b2deffc1  2c

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless

I don't quite understand whether that "Note" is a bad sign or not.

When we added the nodes, we listed the first two servers (the ones already in the cluster) as seeds in the configuration of the first added node. For the second added node, we included the newly added node in the seeds as well.

We are using Ec2Snitch, and listen_address has been set to the corresponding address above on each server.
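For illustration, the relevant cassandra.yaml settings on the first added node would look roughly like this (addresses taken from the status output above, assuming from the loads that 10.11.1.104 and 10.11.1.231 are the original nodes; note that auto_bootstrap defaults to true when absent and must not be false for a joining node to stream data):

# cassandra.yaml excerpt (illustrative)
auto_bootstrap: true
listen_address: 10.11.11.187
endpoint_snitch: Ec2Snitch
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "10.11.1.104,10.11.1.231"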

We haven't run cleanup yet, but we tried to run a repair, and it reported that there was nothing to repair in our keyspace.
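For reference, the commands in question would be something like this (the keyspace argument is optional):

nodetool repair keyspace_name    # anti-entropy repair for one keyspace
nodetool cleanup                 # removes data a node no longer owns; run on the original nodes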

Here is how our cluster was created:

CREATE KEYSPACE keyspace_name WITH replication = {'class': 'NetworkTopologyStrategy', 'us-west-2': '1'}  AND durable_writes = true;
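Worth noting: with 'us-west-2': '1', every row has exactly one replica, so any problem with how a new node claims its token ranges makes rows in those ranges unreachable rather than merely under-replicated. Purely as an illustration, redundancy could be added by raising the replication factor and repairing afterwards:

ALTER KEYSPACE keyspace_name WITH replication = {'class': 'NetworkTopologyStrategy', 'us-west-2': '2'};
-- then run "nodetool repair keyspace_name" on each node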

And the options of all of our tables:

CREATE TABLE keyspace_name."CFName" (
    // ...
) WITH bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';

The data reappears if I decommission the new nodes.

EDIT: It was actually a mistake in reading the documentation... A colleague had set auto_bootstrap to false instead of setting it to true... With auto_bootstrap: false, the new nodes took ownership of their token ranges without streaming the existing data for those ranges, which is why rows seemed to vanish and reappeared once the nodes were decommissioned.

Upvotes: 2

Views: 3999

Answers (2)

Tushar

Reputation: 578

Well, you can specify the keyspace name to get rid of that Note (effective ownership is then computed against that keyspace's replication settings), which in this case is:

nodetool status keyspace_name

Upvotes: 0

Roman Tumaykin

Reputation: 1931

You should run nodetool rebuild on the new nodes after adding them with auto_bootstrap: false.

http://www.datastax.com/documentation/cassandra/2.0/cassandra/tools/toolsRebuild.html
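For example, assuming the source datacenter is us-west-2 as shown in the nodetool status output above:

nodetool rebuild -- us-west-2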

HTH

Upvotes: 2
