Reputation: 127791

Retry connection to Cassandra node upon startup

I want to use Docker to start my application and Cassandra database, and I would like to use Docker Compose for that. Unfortunately, Cassandra starts much slower than my application, and since my application eagerly initializes the Cluster object, I get the following exception:

com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: cassandra/172.18.0.2:9042 (com.datastax.driver.core.exceptions.TransportException: [cassandra/172.18.0.2:9042] Cannot connect))
    at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:233)
    at com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:79)
    at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1454)
    at com.datastax.driver.core.Cluster.init(Cluster.java:163)
    at com.datastax.driver.core.Cluster.connectAsync(Cluster.java:334)
    at com.datastax.driver.core.Cluster.connectAsync(Cluster.java:309)
    at com.datastax.driver.core.Cluster.connect(Cluster.java:251)

According to the stack trace and a little debugging, it seems that the Cassandra Java driver does not apply retry policies to the initial connection at startup. This seems kinda weird to me. Is there a way to configure the driver so that it keeps trying to connect to the server until it succeeds?

Upvotes: 27

Answers (6)

counter2015

Reputation: 1097

refer: https://stackoverflow.com/a/69612290/10428392

You could modify your Docker Compose file to add a health check, like this:

version: '3.8'
services:
  application-service:
    image: your-application-service:0.0.1
    depends_on:
      cassandra:
        condition: service_healthy


  cassandra:
    image: cassandra:4.0.1
    ports:
      - "9042:9042"
    healthcheck:
      test: ["CMD", "cqlsh", "-u", "cassandra", "-p", "cassandra", "-e", "describe keyspaces"]
      interval: 15s
      timeout: 10s
      retries: 10

Upvotes: 0

Leandro Jacomelli

Reputation: 1

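If you are orchestrating several containers, use Docker Compose with the depends_on option. On its own, depends_on only controls start order and does not wait for Cassandra to accept connections, so the example below also restarts the app when it fails: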

version: '2'
services:
  cassandra:
    image: cassandra:3.5
    ports:
      - "9042:9042"
      - "9160:9160"
    environment:
      CASSANDRA_CLUSTER_NAME: demo
  app:
    image: your-app
    restart: unless-stopped
    depends_on:
      - cassandra

Upvotes: -1

Samyel

Reputation: 139

The Datastax driver cannot be configured this way.

If this is only a problem with Docker and you do not wish to change your code, consider something like wait-for-it, a simple script that waits for a TCP port to be listening before starting your application. 9042 is Cassandra's native transport port.

Other options are discussed in the Docker documentation. I have personally only used wait-for-it, and found it useful when working with Cassandra in Docker.
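
If you would rather do the same kind of wait inside the application, the idea behind wait-for-it can be sketched in plain Java. This is only a minimal sketch, not part of the driver or of wait-for-it: the class name PortWaiter, the method waitForPort, the 1-second connect timeout, and the 500 ms pause are all assumptions made for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.time.Duration;

// Hypothetical helper: polls a TCP port until it accepts a connection.
final class PortWaiter {
    private PortWaiter() {}

    /**
     * Returns true as soon as a TCP connection to host:port succeeds,
     * or false if the timeout elapses (or the thread is interrupted) first.
     */
    static boolean waitForPort(String host, int port, Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (System.nanoTime() < deadline) {
            try (Socket socket = new Socket()) {
                // Per-attempt connect timeout of 1 second (arbitrary choice).
                socket.connect(new InetSocketAddress(host, port), 1000);
                return true;
            } catch (IOException e) {
                // Port not open (or host not resolvable) yet; retry after a pause.
            }
            try {
                Thread.sleep(500);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```

For the setup in the question, the call would be something like PortWaiter.waitForPort("cassandra", 9042, Duration.ofMinutes(2)) before building the Cluster. An open port only shows that something is listening, so a retry around the actual connect call can still be worth keeping.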

Upvotes: 1

medvekoma

Reputation: 1179

If you don't want to change your client code, and your client application's Docker container stops because of the error, you can set the following attribute for the client app in your docker-compose file:

restart: unless-stopped

That will restart your client application container as many times as it fails. Example docker-compose.yml file:

version: '2'
services:
  cassandra:
    image: cassandra:3.5
    ports:
      - "9042:9042"
      - "9160:9160"
    environment:
      CASSANDRA_CLUSTER_NAME: demo
  app:
    image: your-app
    restart: unless-stopped

Upvotes: 2

gsteiner

Reputation: 776

You should be able to write some try/catch logic around the NoHostAvailableException to retry the connection after a 5-10 second wait. I would recommend retrying only a limited number of times, and rethrowing the exception once enough time has passed that Cassandra should have started.

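Example (doConnectionStuff() stands in for your own connection code)

import java.util.concurrent.TimeUnit;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.exceptions.NoHostAvailableException;

// Tries to connect up to retryCount times, waiting 5 seconds between attempts.
Session makeCassandraConnection(int retryCount) throws InterruptedException {
    RuntimeException lastException = new IllegalStateException("retryCount must be positive");
    while (retryCount > 0) {
        try {
            // Placeholder: build a new Cluster and call connect() on it.
            return doConnectionStuff();
        } catch (NoHostAvailableException e) {
            lastException = e;
            retryCount--;
            if (retryCount > 0) {
                Thread.sleep(TimeUnit.SECONDS.toMillis(5));
            }
        }
    }
    // Every attempt failed: rethrow the last failure.
    throw lastException;
}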
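
The same retry loop can also be written as a small generic helper that compiles without the driver on the classpath. Treat it as a sketch under assumptions: ConnectRetry and withRetry are hypothetical names, and the fixed delay between attempts is an illustrative choice.

```java
import java.util.function.Supplier;

// Hypothetical helper: retries a connection attempt a bounded number of times.
final class ConnectRetry {
    private ConnectRetry() {}

    /**
     * Calls connect up to maxAttempts times, sleeping delayMillis after each
     * failed attempt. Only exceptions of type retryOn trigger a retry; any
     * other exception, or the last failure, is rethrown to the caller.
     */
    static <T> T withRetry(Supplier<T> connect,
                           Class<? extends RuntimeException> retryOn,
                           int maxAttempts,
                           long delayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return connect.get();
            } catch (RuntimeException e) {
                if (!retryOn.isInstance(e) || attempt >= maxAttempts) {
                    throw e;
                }
            }
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting to retry", ie);
            }
        }
    }
}
```

With the 3.x driver from the question, a call might look like ConnectRetry.withRetry(() -> Cluster.builder().addContactPoint("cassandra").build().connect(), NoHostAvailableException.class, 10, 5000). Building a new Cluster inside the supplier on every attempt is deliberate here, on the assumption that a Cluster whose initialization failed should not be reused.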

Upvotes: 13

SkyWalker

Reputation: 29150

Try increasing the connection timeout; this kind of failure sometimes happens on AWS and the like. I think you're looking at a late stage in the error log: at some earlier point it should tell you that it couldn't connect because of a timeout or an unreachable network, after which it flags the nodes as unavailable.

With phantom, the code looks like this:

val Connector = ContactPoints(Seq(seedHost))
    .withClusterBuilder(_.withSocketOptions(
      new SocketOptions()
      .setReadTimeoutMillis(1500)
      .setConnectTimeoutMillis(20000)
    )).keySpace("bla")

Resource Link:

com.datastax.driver.core.exceptions.NoHostAvailableException #445

Upvotes: -2
