Klun
Klun

Reputation: 54

Cassandra Spark Connector : requirement failed, contact points contain multiple data centers

I have two Cassandra datacenters, with all servers in the same building, connected with 10 gbps network. The RF is 2 in each datacenter.

I need to ensure strong consistency inside my app, so I first planed to use QUORUM consistency (3 replicas of 4 must respond) on both reads and writes. With that configuration, I can also be fault tolerant if a node crash on a particular datacenter.

So I set multiples contact point from multiples datacenter to my spark connector, but the following error is immediately returned : requirement failed, contact points contain multiple data centers

So I look at the documentation. It say :

Connections are never made to data centers other than the data center of spark.cassandra.connection.host [...]. This technique guarantees proper workload isolation so that a huge analytics job won't disturb the realtime part of the system.

Okay. So after reading that, I plan to switch to LOCAL_QUORUM (2 replicas of 2 must respond) on write, and LOCAL_ONE on read, to still get strong consistency, and connect by default on datacenter1.

The problem, is still consistency, because Spark apps working on the second datacenter datacenter2 don't have strong consistency on write, because data are just asynchronously synchronized from datacenter1.

To avoid that, I can set write consistency to EACH_QUORUM (= ALL). But the problem in that case, is if a single node is unresponsive or down, the entire writes are unable to process.

So my only option, to have both some fault tolerance, AND strong consistency, is to switch my replication factor from 2 to 3 on each datacenter. Then use EACH_QUORUM on write, and LOCAL_QUORUM on read ? Is that correct ?

Thank you

Upvotes: 1

Views: 325

Answers (1)

Erick Ramirez
Erick Ramirez

Reputation: 16303

This comment indicates there is some misunderstanding on your part:

... because data are just asynchronously synchronized from datacenter1.

so allow me to clarify.

The coordinator of a write request sends each mutation (INSERT, UPDATE, DELETE) to ALL replicas in ALL data centres in real time. It doesn't happen at some later point in time (i.e. 2 seconds later, 10s later or 1 minute later) -- it gets sent to all DCs at the same time without delay regardless of whether you have a 1Mbps or 10Gbps link between DCs.

We also recommend a minimum of 3 replicas in each DC in production as well as use LOCAL_QUORUM for both reads and writes. There are very limited edge cases where these recommendations do not apply.

The spark-cassandra-connector requires all contacts points to belong to the same DC so:

  1. analytics workloads do not impact the performance of OLTP DCs (as you already pointed out), and
  2. it can achieve data-locality for optimal performance where possible.

Upvotes: 2

Related Questions