Reputation: 23
I am changing the original post to a memory leak report: from what I have observed, the Cassandra Python driver does not release sessions from memory, and during heavy inserts it eats up all the memory (thus crashing Cassandra, since not enough room is left for GC).
This was raised earlier, but I see the issue in the latest drivers as well.
https://github.com/datastax/python-driver/pull/131
In [2]: cassandra.__version__
Out[2]: '2.1.4'
import logging
import time

from cassandra.cluster import Cluster

class SimpleClient(object):
    session = None

    def connect(self, nodes):
        cluster = Cluster(nodes)
        metadata = cluster.metadata
        self.session = cluster.connect()
        logging.info('Connected to cluster: ' + metadata.cluster_name)
        for host in metadata.all_hosts():
            logging.info('Datacenter: %s; Host: %s; Rack: %s',
                         host.datacenter, host.address, host.rack)
            print("Datacenter: %s; Host: %s; Rack: %s"
                  % (host.datacenter, host.address, host.rack))

    def close(self):
        self.session.cluster.shutdown()
        logging.info('Connection closed.')

def main():
    logging.basicConfig()
    client = SimpleClient()
    client.connect(['127.0.0.1'])
    client.close()

if __name__ == "__main__":
    count = 0
    while count != 1:  # count is never incremented, so this loops forever
        main()
        time.sleep(1)
If anyone has found a solution for this, please share.
Upvotes: 0
Views: 1033
Reputation: 7365
There was an issue where registering shutdown hooks was keeping Cluster references around. This is resolved in driver version 3.4.0.
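If upgrading is not immediately possible, a common workaround is to create the Cluster and Session once per process and reuse them, instead of building and shutting down a new Cluster on every iteration as the loop in the question does. A minimal sketch of that pattern (the cached factory `get_session` is my own helper name, not a driver API):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def get_session(nodes=('127.0.0.1',)):
    """Create the Cluster/Session once and reuse it on later calls."""
    # Imported inside the function so the module loads even when the
    # DataStax driver is not installed.
    from cassandra.cluster import Cluster
    cluster = Cluster(list(nodes))
    return cluster.connect()
```

Repeated calls with the same `nodes` then return the same Session object; shut it down once at process exit with `session.cluster.shutdown()` rather than per request.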
Upvotes: 1
Reputation: 16576
I think GC is very likely the cause. Depending on how much data you are inserting (and how fast), you could be causing a rather long pause on the C* side. The Python driver keeps a constant connection to the C* database, which it occasionally pings for cluster state information. The error you are seeing is that connection failing to actually receive data.
You should be able to see in your Cassandra logs a record of each GC and its duration.
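One quick way to spot long pauses is to scan those logs for GCInspector entries. A hedged sketch (the exact line format varies by Cassandra version; the regex assumes lines like `GCInspector.java:258 - ParNew GC in 648ms.`, and the 500 ms threshold is an arbitrary example):

```python
import re

# Matches the collector name and pause duration in GCInspector log lines.
GC_RE = re.compile(r'(?P<collector>\w+) GC in (?P<ms>\d+)ms')

def long_gc_pauses(lines, threshold_ms=500):
    """Return (collector, pause_ms) pairs for pauses >= threshold_ms."""
    pauses = []
    for line in lines:
        m = GC_RE.search(line)
        if m and int(m.group('ms')) >= threshold_ms:
            pauses.append((m.group('collector'), int(m.group('ms'))))
    return pauses
```

Feeding it the lines of `system.log` will surface any collections long enough to make the driver's control connection time out.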
Upvotes: 1