Hitesh

Reputation: 3498

Cassandra Port Exhaustion

When a large volume of data is written to a Cassandra cluster, the host computer running the writing application suffers port exhaustion.

Details of the problem are as follows:

Our application constantly writes data to a Cassandra cluster made up of 3 nodes. The application is written in C# and is multithreaded. Assume 100 threads start up and each thread begins issuing write operations to Cassandra using the DataStax C# driver. According to this document (4 simple rules when using datastax driver for cassandra), the Session object is thread safe, so we reuse it in every thread. After the application has run for a few hours, we observe "Port Exhaustion": the host computer running the application stops creating or accepting any other connections. After investigating the issue, we think that the writes issued by each thread to the Cassandra driver create individual physical connections to the Cassandra cluster (up to 20k connections).

When an individual write operation completes, its connection is closed. But connections are not created and closed at the same rate: they are opened much faster than they are closed. By the time there are approximately 20k open connections, the host computer will not create any more.

Our question is: is this the expected behaviour of the Cassandra driver/system when writes take longer to execute than new writes take to arrive, so that many connections stay open for a long time?

If it is the expected behaviour, what alternatives are there? (For example, running the app on multiple machines with the tasks distributed among them, to avoid port exhaustion.)

If it is not the expected behaviour in this scenario, we would highly appreciate being pointed to probable solutions.

Details of the server running the C# application:

OS : Windows Server 2012 R2

Memory : 8 GB

Datastax Enterprise : 4.8.3

Cassandra Version : 2.1

Cassandra C# Driver Version : 3.0.5

Code for creating the cluster and session:

// Comma-separated contact points for the 3-node cluster.
string NodeIps = "127.0.0.1,127.0.0.2,127.0.0.3";
List<string> addresses = NodeIps.Split(',').ToList();   // needs System.Linq
// cluster and KeySpace are declared elsewhere in the application.
cluster = Cluster.Builder()
    .AddContactPoints(addresses)                                   // node IPs
    .WithRetryPolicy(DowngradingConsistencyRetryPolicy.Instance)
    // Reconnection delays in milliseconds: 0, 5 s, 2 min, 60 min.
    .WithReconnectionPolicy(new FixedReconnectionPolicy(0, 5000, 2 * 60000, 60 * 60000))
    .WithQueryTimeout(600000)                                      // milliseconds; 10 min = 600000
    .Build();
// One session, shared by all writer threads.
ISession session = cluster.Connect(KeySpace);

Upvotes: 1

Views: 496

Answers (1)

Samyel

Reputation: 139

The Cassandra driver supports using the same connection for more than one query, but the defaults are set very low on the C# driver - at least compared to the drivers I've used in the past.

YMMV, but you can tune the connection pooling options (documented here) so that each connection is used for multiple requests. You can also cap the maximum number of connections the driver creates, which stops the host from running out of ports.

This will obviously bottleneck your application if more requests are coming in than can be dealt with in time. You will need to either optimise or use additional servers in the case that this is a problem.
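
As a rough sketch of the pooling settings mentioned above, something like the following should work with the 3.x C# driver. The numbers and the "my_keyspace" name are placeholders I made up, not recommendations, and the class name is only there to make the snippet compile on its own. Note that "Treshold" is how the driver itself spells that method; check the PoolingOptions documentation for your driver version, since the available setters and their defaults have changed between releases.

```csharp
using Cassandra;

public static class PooledClusterExample
{
    public static ISession Connect()
    {
        // Illustrative values only; tune them for your workload.
        var pooling = new PoolingOptions()
            // Connections opened to each local node up front.
            .SetCoreConnectionsPerHost(HostDistance.Local, 2)
            // Hard cap on connections per local node.
            .SetMaxConnectionsPerHost(HostDistance.Local, 8)
            // In-flight requests allowed on a connection before the
            // driver considers opening another one (up to the cap).
            .SetMaxSimultaneousRequestsPerConnectionTreshold(HostDistance.Local, 128);

        var cluster = Cluster.Builder()
            .AddContactPoints("127.0.0.1", "127.0.0.2", "127.0.0.3")
            .WithPoolingOptions(pooling)
            .Build();

        // "my_keyspace" is a placeholder. Create the session once and
        // share it between all threads.
        return cluster.Connect("my_keyspace");
    }
}
```

With a cap like this the number of sockets to the cluster is bounded at roughly max connections per host times the number of nodes, so a backlog of writes shows up as queued or slower requests rather than as thousands of open ports.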

Upvotes: 0
