Artemiy Firsov

Reputation: 48

Set up a Cassandra cluster from 2 nodes with existing installations

I have 2 servers, and I installed Cassandra separately on each. Each node has its own tables, all of them with replication factor = 1.

Now I want to join those 2 servers into a single cluster. Can I do that while preserving the data, and what would the procedure be?

Can you please advise?

Upvotes: 1

Views: 138

Answers (1)

Alex Ott

Reputation: 87329

You can't do this in an "online" fashion, because the two nodes really belong to 2 different clusters, each with its own cluster ID, etc. You can do it as follows (in this list, "cluster 2" is the node with less data, so pick it based on the amount of data on each node):

  • stop all applications that use cluster 2
  • make a copy of the schema of cluster 2, for example with cqlsh -e 'describe schema;' > schema.cql
  • shut down the cluster 2 node: run nodetool drain first (required!), and then stop the node
  • move the contents of the data directories somewhere else, making sure that no old data is left (also check commit logs, hints, etc.)
  • modify the configuration of the cluster 2 node: set the cluster name to the cluster 1 name, and point to the cluster 1 node as the seed (it's very important not to use the cluster 2 node as a seed!)
  • start the cluster 2 node; it will begin joining cluster 1 and streaming data from it
  • after node 2 is shown as UN in nodetool status, you can start to copy the data:
    • if cluster 2 had keyspaces & tables that don't exist in cluster 1, create them manually using the schema saved in the earlier step. If tables had the same name but a different structure (different field types, primary key, etc.), create new tables for them, as sstableloader won't be able to stream data into tables with a different structure
    • stream the data from the saved data directories of the cluster 2 node using sstableloader
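
The steps above can be sketched as a small dry-run planner that only builds the ordered list of commands and executes nothing. This is an illustration under stated assumptions, not a tested migration tool: the function name build_migration_plan, the systemd service name cassandra, the /var/lib/cassandra layout and the example paths are all hypothetical and need to be adapted to the actual installation. The cassandra.yaml edit (cluster_name and the seeds list) and the schema creation are left as manual steps.

```python
import shlex


def build_migration_plan(seed_host, backup_dir, sstable_dirs,
                         cassandra_home="/var/lib/cassandra"):
    """Return ordered (description, command) steps for the cluster 2 node.

    Nothing is executed. A command of None marks a manual step.
    Paths and the service name are assumptions, not Cassandra requirements.
    """
    q = shlex.quote
    steps = [
        ("Save the cluster 2 schema",
         "cqlsh -e 'describe schema;' > schema.cql"),
        ("Drain the node before stopping it", "nodetool drain"),
        ("Stop the node (assumes a systemd service named cassandra)",
         "sudo systemctl stop cassandra"),
        ("Create the backup location", f"mkdir -p {q(backup_dir)}"),
    ]
    # Move data, commit logs, hints and saved caches so no old state remains.
    for sub in ("data", "commitlog", "hints", "saved_caches"):
        src = f"{cassandra_home}/{sub}"
        dst = f"{backup_dir}/{sub}"
        steps.append((f"Move old {sub} directory", f"mv {q(src)} {q(dst)}"))
    steps += [
        (f"Edit cassandra.yaml: cluster_name = cluster 1 name, "
         f"seeds = {seed_host} (never the cluster 2 node)", None),
        ("Start the node so it joins cluster 1",
         "sudo systemctl start cassandra"),
        ("Repeat until this node is listed as UN", "nodetool status"),
        ("Create missing keyspaces/tables from schema.cql by hand", None),
    ]
    # One sstableloader run per saved table directory, pointed at cluster 1.
    for table_dir in sstable_dirs:
        steps.append((f"Stream {table_dir}",
                      f"sstableloader -d {q(seed_host)} {q(table_dir)}"))
    return steps


if __name__ == "__main__":
    plan = build_migration_plan(
        "10.0.0.1",                              # hypothetical cluster 1 node
        "/backup/cluster2",                      # hypothetical backup location
        ["/backup/cluster2/data/ks1/events"],    # hypothetical table directory
    )
    for description, command in plan:
        print(f"# {description}")
        print(command if command else "(manual step)")
```

Printing the plan first makes it easy to review the order (drain, stop, move, reconfigure, start, verify, stream) before running anything by hand.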

A similar alternative: stream the data from the cluster 2 node to the cluster 1 node first, then wipe the data directories on the cluster 2 node and join it to cluster 1.

Upvotes: 1
