Reputation: 243
We are planning to use apache shiro & cassandra for distributed session management very similar to mentioned @ https://github.com/lhazlewood/shiro-cassandra-sample
Need advice on deployment for cassandra in Amazon EC2:
In EC2, we have below setup: Single region, 2 Availability Zones(AZ), 4 Nodes
Accordingly, cassandra is configured:
Single DataCenter: DC1
two Racks: Rack1, Rack2
4 Nodes: Rack1_Node1, Rack1_Node2, Rack2_Node1, Rack2_Node2
Data Replication Strategy used is NetworkTopologyStrategy
Since Cassandra is used as session datastore, we need high consistency and availability.
My Questions:
Upvotes: 1
Views: 542
Reputation: 368
You probably have heard of the CAP theorem in database theory. If not, You may learn the details about the theorem in wikipedia: https://en.wikipedia.org/wiki/CAP_theorem, or just google it. It says for a distributed database with multiple nodes, a database can only achieve two of the following three goals: consistency, availability and partition tolerance.
Cassandra is designed to achieve high availability and partition tolerance (AP), but sacrifices consistency to achieve that. However, you could set consistency level to all in Cassandra to shift it to CA, which seems to be your goal. Your setting of quorum 2 is essentially the same as "all" since you have 2 replicas. But in this setting, if a single node containing the data is down, the client will get an error message for read/write (not partition-tolerant).
You may take a look at a video here to learn some more (it requires a datastax account): https://academy.datastax.com/courses/ds201-cassandra-core-concepts/introduction-big-data
Upvotes: 0
Reputation: 1089
Generally going for an even number of nodes is not the best idea, as is going for an even number of availability zones. In this case, if one of the racks fails, the entire cluster will be gone. I'd recommend to go for 3 racks with 1 or 2 nodes per rack, 3 replicas and QUORUM for read and write. Then the cluster would only fail if two nodes/AZ fail.
Upvotes: 1