Reputation: 127791
I want to use Docker to start my application and a Cassandra database, and I would like to use Docker Compose for that. Unfortunately, Cassandra starts much more slowly than my application, and since my application eagerly initializes the Cluster object, I get the following exception:
    com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: cassandra/172.18.0.2:9042 (com.datastax.driver.core.exceptions.TransportException: [cassandra/172.18.0.2:9042] Cannot connect))
        at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:233)
        at com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:79)
        at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1454)
        at com.datastax.driver.core.Cluster.init(Cluster.java:163)
        at com.datastax.driver.core.Cluster.connectAsync(Cluster.java:334)
        at com.datastax.driver.core.Cluster.connectAsync(Cluster.java:309)
        at com.datastax.driver.core.Cluster.connect(Cluster.java:251)
According to the stack trace and a little debugging, it seems that the Cassandra Java driver does not apply retry policies to the initial connection. This seems kinda weird to me. Is there a way to configure the driver so that it keeps trying to connect to the server until it succeeds?
Upvotes: 27
Views: 4113
Reputation: 1097
Reference: https://stackoverflow.com/a/69612290/10428392
You could add a health check to the Docker Compose file, so that the application only starts once Cassandra reports healthy:
    version: '3.8'
    services:
      application-service:
        image: your-application-service:0.0.1
        depends_on:
          cassandra:
            condition: service_healthy
      cassandra:
        image: cassandra:4.0.1
        ports:
          - "9042:9042"
        healthcheck:
          test: ["CMD", "cqlsh", "-u", "cassandra", "-p", "cassandra", "-e", "describe keyspaces"]
          interval: 15s
          timeout: 10s
          retries: 10
Upvotes: 0
Reputation: 1
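If you are orchestrating many containers, you should use Docker Compose with the depends_on key. Note that depends_on only controls start order and does not wait for Cassandra to accept connections, so the example also sets restart: unless-stopped on the app: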
    version: '2'
    services:
      cassandra:
        image: cassandra:3.5
        ports:
          - "9042:9042"
          - "9160:9160"
        environment:
          CASSANDRA_CLUSTER_NAME: demo
      app:
        image: your-app
        restart: unless-stopped
        depends_on:
          - cassandra
Upvotes: -1
Reputation: 139
The DataStax driver cannot be configured this way.
If this is only a problem with Docker and you do not wish to change your code, you could consider using something like wait-for-it, a simple script that waits for a TCP port to be listening before starting your application. 9042 is Cassandra's native transport port.
Other options are discussed here in the Docker documentation. I have personally only used wait-for-it, but I found it useful when working with Cassandra in Docker.
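If you would rather not add a shell script to the image, the same idea can be sketched in plain Java. This is not the wait-for-it script itself, just a minimal illustration of the technique: poll the port until a TCP connection succeeds or a timeout elapses. The class name PortWaiter, the method name waitForPort, and the 1-second per-attempt connect timeout are all assumptions made for this example.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class PortWaiter {

    /**
     * Polls host:port until a TCP connection succeeds or timeoutMillis elapses.
     * Returns true if the port accepted a connection, false on timeout.
     */
    static boolean waitForPort(String host, int port, long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            try (Socket socket = new Socket()) {
                // Per-attempt connect timeout of 1 second (an arbitrary choice).
                socket.connect(new InetSocketAddress(host, port), 1000);
                return true;
            } catch (IOException e) {
                // Port not listening yet, or host name not resolvable yet: retry below.
            }
            if (System.nanoTime() >= deadline) {
                return false;
            }
            Thread.sleep(pollMillis);
        }
    }
}
```

Called before the driver is initialized, for example as `PortWaiter.waitForPort("cassandra", 9042, 120_000, 2_000)`, this blocks for up to two minutes. An open port only shows that something is listening, so it is a weaker signal than a successful CQL query.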
Upvotes: 1
Reputation: 1179
If you don't want to change your client code, and your client application's Docker container stops because of the error, you can set the following attribute for the client app in your docker-compose file:
    restart: unless-stopped
That will restart your client application container every time it fails. Example docker-compose.yml file:
    version: '2'
    services:
      cassandra:
        image: cassandra:3.5
        ports:
          - "9042:9042"
          - "9160:9160"
        environment:
          CASSANDRA_CLUSTER_NAME: demo
      app:
        image: your-app
        restart: unless-stopped
Upvotes: 2
Reputation: 776
You should be able to write some try/catch logic around the NoHostAvailableException to retry the connection after a 5-10 second wait. I would recommend retrying only a few times, and rethrowing the exception once enough time has passed that Cassandra should have started.
Example pseudocode:
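    // Pseudocode: Connection and doConnectionStuff() stand in for your own types.
    Connection makeCassandraConnection(int retryCount) throws Exception {
        Exception lastException = new IllegalStateException("retryCount must be positive");
        while (retryCount > 0) {
            try {
                return doConnectionStuff();
            } catch (NoHostAvailableException e) {
                lastException = e;
                retryCount--;
                if (retryCount > 0) {
                    // Wait before the next attempt; skip the wait after the last one.
                    Thread.sleep(TimeUnit.SECONDS.toMillis(5));
                }
            }
        }
        throw lastException;
    }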
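The "certain time period" variant can be sketched with a deadline instead of a retry count. This is a generic helper that uses only the JDK, not part of the DataStax driver; the names ConnectRetry and retryUntil are made up for this example, and it retries on any Exception, whereas real code would probably catch only NoHostAvailableException.

```java
import java.time.Duration;
import java.util.concurrent.Callable;

final class ConnectRetry {

    /**
     * Calls attempt until it returns normally, sleeping for delay between failures.
     * Once the next sleep would run past maxWait, the last failure is rethrown.
     */
    static <T> T retryUntil(Callable<T> attempt, Duration maxWait, Duration delay)
            throws Exception {
        long deadline = System.nanoTime() + maxWait.toNanos();
        while (true) {
            try {
                return attempt.call();
            } catch (InterruptedException e) {
                throw e; // never swallow interruption
            } catch (Exception e) {
                // Assumption: every other failure is treated as "not ready yet".
                if (System.nanoTime() + delay.toNanos() > deadline) {
                    throw e; // out of time: surface the last failure
                }
                Thread.sleep(delay.toMillis());
            }
        }
    }
}
```

Usage would look something like `ConnectRetry.retryUntil(() -> buildClusterAndConnect(), Duration.ofMinutes(2), Duration.ofSeconds(5))`, where buildClusterAndConnect is a hypothetical method of your own. It is probably safest for that method to build a fresh Cluster on each attempt rather than reuse one whose initialization has already failed.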
Upvotes: 13
Reputation: 29150
Try increasing the connection timeout; connect timeouts are something that sometimes happens on AWS and the like. I think you're looking at a late stage in the error log: at some earlier point it should tell you it couldn't connect because of a timeout or an unreachable network, after which it flags the nodes as unavailable.
Using phantom, the code looks like this:
    val Connector = ContactPoints(Seq(seedHost))
      .withClusterBuilder(_.withSocketOptions(
        new SocketOptions()
          .setReadTimeoutMillis(1500)
          .setConnectTimeoutMillis(20000)
      )).keySpace("bla")
com.datastax.driver.core.exceptions.NoHostAvailableException #445
Upvotes: -2