Parth Mondal

Reputation: 23

Cassandra Python driver error (memory leak while creating sessions)

I am changing the original post to describe a memory leak: from what I have observed, the Cassandra Python driver does not release sessions from memory. During heavy inserts it eats up all the memory, which crashes Cassandra because there is not enough room left for GC.

This was raised earlier, but I still see the issue in the latest drivers as well.

https://github.com/datastax/python-driver/pull/131

     In [2]: cassandra.__version__
     Out[2]: '2.1.4'

    import logging
    import time

    from cassandra.cluster import Cluster


    class SimpleClient(object):
        session = None

        def connect(self, nodes):
            cluster = Cluster(nodes)
            metadata = cluster.metadata
            self.session = cluster.connect()
            logging.info('Connected to cluster: %s', metadata.cluster_name)
            for host in metadata.all_hosts():
                logging.info('Datacenter: %s; Host: %s; Rack: %s', host.datacenter, host.address, host.rack)
                print("Datacenter: %s; Host: %s; Rack: %s" % (host.datacenter, host.address, host.rack))

        def close(self):
            # Shutting down the cluster also closes the session created from it.
            self.session.cluster.shutdown()
            logging.info('Connection closed.')


    def main():
        logging.basicConfig()
        client = SimpleClient()
        client.connect(['127.0.0.1'])
        client.close()


    if __name__ == "__main__":
        count = 0
        # count is never incremented, so this loops forever, creating and
        # shutting down a Cluster every second to reproduce the leak.
        while count != 1:
            main()
            time.sleep(1)

If anyone has found a solution to this, please share.

Upvotes: 0

Views: 1033

Answers (2)

Adam Holmberg

Reputation: 7365

There was an issue where registering shutdown hooks was keeping Cluster references around. This is resolved in driver version 3.4.0.

JIRA

PR
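
To illustrate the kind of problem described above, here is a minimal, standard-library-only sketch of how a shutdown hook can keep an object alive. It is not the driver's code: `FakeCluster`, `make_with_strong_hook`, `make_with_weak_hook`, and `_shutdown_if_alive` are hypothetical names, and the weak-reference variant is just one common way to avoid the problem, not necessarily what the driver fix does.

```python
import atexit
import gc
import weakref


class FakeCluster(object):
    """Hypothetical stand-in for a driver object; not the real Cluster."""

    def shutdown(self):
        pass


def make_with_strong_hook():
    obj = FakeCluster()
    # The bound method holds a strong reference to obj, and atexit keeps
    # the bound method until interpreter exit, so obj is never freed.
    atexit.register(obj.shutdown)
    return weakref.ref(obj)


def _shutdown_if_alive(ref):
    # Only call shutdown if the object still exists at exit time.
    obj = ref()
    if obj is not None:
        obj.shutdown()


def make_with_weak_hook():
    obj = FakeCluster()
    # The hook only holds a weak reference, so obj can be collected
    # once the caller drops it.
    atexit.register(_shutdown_if_alive, weakref.ref(obj))
    return weakref.ref(obj)


leaked = make_with_strong_hook()
freed = make_with_weak_hook()
gc.collect()

print(leaked() is not None)  # object is still referenced by the atexit hook
print(freed() is None)       # object was released
```

In a loop like the one in the question, the first pattern would accumulate one retained object per iteration, which matches the steadily growing memory that was reported.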

Upvotes: 1

RussS

Reputation: 16576

I think a GC pause is very likely the cause. Depending on how much data you are inserting (and how fast), you could be causing a rather long pause on the C* side. The Python driver keeps a constant connection to the C* database, which it occasionally pings for cluster state information. The error you are seeing is this connection failing to actually receive data.

You should be able to see a record of each GC and its duration in your Cassandra logs.

Upvotes: 1
