snowindy

Reputation: 3251

How to properly connect client application to Scylla or Cassandra?

Let's say I have a 3-node ScyllaDB cluster in my local network (it could be an AWS VPC). My Java application runs in the same local network.

I am unsure how to properly connect the app to the DB.

I would be very grateful for a code sample showing how to connect a Java app to a multi-node cluster.

Upvotes: 3

Views: 1917

Answers (3)

Aaron

Reputation: 57843

Answering the specific questions:

Do I need to specify all 3 IP addresses of DB nodes for the app?

No. Your app just needs one to work. But it might not be a bad idea to have a few, just in case one is down.

What if over time one or several nodes die and get resurrected on other IPs?

As long as your app doesn't stop, the driver maintains its own view of the cluster, a bit like its own version of gossip. So it will see new nodes being added and connect to them as needed.

Do I have to manually reconfigure application?

If you're specifying IP addresses, yes.

How is it done properly in big real production cases with tens of DB servers, possibly in different data centers?

By abstracting away the need for a specific IP, using something like Consul. If you wanted to, you could easily build a simple RESTful service to expose an inventory list or even the results of nodetool status.

Upvotes: 1

Greg

Reputation: 750

Alex Ott's answer is correct, but I wanted to add a bit more background so that it doesn't look arbitrary.

The selection of the 2 or 3 nodes to connect to is described at https://docs.scylladb.com/kb/seed-nodes/

However, going forward, Scylla is looking to move away from differentiating between Seed and non-Seed nodes. So, in future releases, the answer will likely be different. Details on these developments at: https://www.scylladb.com/2020/09/22/seedless-nosql-getting-rid-of-seed-nodes-in-scylla/

Upvotes: 1

Alex Ott

Reputation: 87369

You need to specify contact points (you can use DNS names instead of IPs): several nodes, usually 2-3. The driver will connect to one of them and, once connected, discover all the nodes of the cluster (see the driver's documentation). After the connection is established, the driver keeps a separate control connection open, through which it receives information about nodes going up and down, joining or leaving the cluster, and so on, so it can keep its view of the cluster topology up to date.

If you're specifying DNS names instead of IP addresses, it's better to set the configuration parameter datastax-java-driver.advanced.resolve-contact-points to false (see docs), so that the names are resolved to IPs again on every reconnect instead of only once at application start.
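Since the question asked for a code sample, here is a minimal sketch of the contact-point part, using only the JDK so it runs without the driver on the classpath. The host names (scylla-1.internal and so on), the ContactPoints helper class, and the datacenter name datacenter1 are made up for illustration. The addresses are created unresolved because, as far as I know, the resolve-contact-points setting only applies to contact points listed in the configuration file; addresses passed programmatically have to be created unresolved by the application if the names should be looked up again on reconnect. The commented-out lines at the end show how the list would be handed to the Java driver 4.x.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: turns "host[:port],host[:port],..." into contact points.
final class ContactPoints {
    // 9042 is the default CQL native transport port for Cassandra and Scylla.
    static final int DEFAULT_CQL_PORT = 9042;

    static List<InetSocketAddress> parse(String csv) {
        List<InetSocketAddress> points = new ArrayList<>();
        for (String entry : csv.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas
            }
            // Note: this simple split does not handle IPv6 literals.
            int colon = trimmed.lastIndexOf(':');
            String host = colon < 0 ? trimmed : trimmed.substring(0, colon);
            int port = colon < 0
                    ? DEFAULT_CQL_PORT
                    : Integer.parseInt(trimmed.substring(colon + 1));
            // Unresolved: no DNS lookup happens here; the name is kept as-is.
            points.add(InetSocketAddress.createUnresolved(host, port));
        }
        return points;
    }

    public static void main(String[] args) {
        // Two or three nodes are enough; the driver discovers the rest.
        List<InetSocketAddress> points =
                parse("scylla-1.internal:9042, scylla-2.internal, scylla-3.internal:9042");
        for (InetSocketAddress p : points) {
            System.out.println(p.getHostString() + ":" + p.getPort()
                    + " unresolved=" + p.isUnresolved());
        }

        // With the DataStax Java driver 4.x on the classpath, roughly:
        //
        // CqlSession session = CqlSession.builder()
        //         .addContactPoints(points)
        //         .withLocalDatacenter("datacenter1") // assumed DC name
        //         .build();
    }
}
```

When contact points are passed programmatically like this, driver 4.x also needs the local datacenter name; check the real name with nodetool status rather than relying on the placeholder above.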

Upvotes: 5
