Reputation: 51
We are experiencing the issue when trying to connect to the cluster after updating the version of Java SDK.
The setup of the system is as follows:
We have a web application that is using Java SDK and a Couchbase cluster. In between we have a VIP (Virtual IP Address). We realise that isn’t ideal but we’re not able to change that immediately since VIP was mandated by Tech Ops. VIP is basically only there to reroute the initial request on application startup. That way we can make modifications on the cluster and ensure that when application starts it can find the cluster regardless of the actual nodes in the cluster and their IPs.
Prior to the issue we used JAVA SDK version 1.4.4. Our application would start and Java SDK would initiate a request on port 8091 to VIP. Please note that port 8091 is the only port open on VIP. VIP would reroute the request to one of the node cluster currently in use the cluster would respond to Java SDK. At that point Java SDK would discover all the nodes in the cluster and application would run fine. During up time if we would add, remove a node from the cluster Java SDK would update automatically and everything would run without the issue.
In the last sprint we updated the Java SDK to version 2.1.3. Our application would start and Java SDK would initiate a request on port 11210 to VIP. Since this port is not open the request would fail and Java SDK would throw an exception:
Caused by: java.lang.RuntimeException: java.util.concurrent.TimeoutException
at com.couchbase.client.java.util.Blocking.blockForSingle(Blocking.java:93)
at com.couchbase.client.java.CouchbaseCluster.openBucket(CouchbaseCluster.java:108)
at com.couchbase.client.java.CouchbaseCluster.openBucket(CouchbaseCluster.java:99)
at com.couchbase.client.java.CouchbaseCluster.openBucket(CouchbaseCluster.java:89)
No further request would be made on any port.
It appears the order in which port are being used has been changed between versions. Could somebody please confirm, or dispute, that the order in which ports are being used for cluster discovery has been changed between versions. Also could somebody please provide some advice on how we could resolve the issue. We are trying to understand the clients behavior, if we could open all those ports on the VIP would the client still then function correctly and at full performance?
The issue is happening on our production environment which we cannot use for testing out potential solutions since it will interfere with our products.
Upvotes: 4
Views: 669
Reputation: 28301
in addition to Kirk's long term answer (which I also advise you to follow), a shorter term solution may be to deactivate the 11210 bootstraping (carrier bootstrap) through the CouchbaseEnvironment
by calling bootstrapCarrierEnabled(false)
on the builder.
I don't guarantee that it'll work with a vIP even after that, but that may be worth a try if you're in a hurry.
Upvotes: 1
Reputation: 4855
In v2.x of the Java SDK, it defaults to 11210 to get the cluster map to bootstrap the application. This is a huge improvement actually as now the map comes from the managed cache and not the cluster manager (8091). The SDK should use 8091 as a fall back if it cannot get the map on 11210 though. Regardless, you really want to get that map from 11210, trust me. It cleans up a lot of problems.
To resolve this long term and follow Couchbase best practices, upgrade to the Java 2.2.x SDK, get rid of the VIP entirely and go with a DNS SRV record instead. That gives you one DNS entry for the SDK connection object and you just manage the node list in DNS. It works great. I say SDK 2.2 as the DNS SRV record solution is fully supported there, in 2.1 it is experimental. VIPs are specifically recommended against by Couchbase these days. In older versions of the SDKs it was fine to do this and it helped with limiting the number of connections from the app to the DB nodes, but that is no longer necessary and can actually be a bad thing.
Upvotes: 2