Reputation: 115
I have an API: a Flask application written in Python and deployed on AWS EC2. Some endpoints need to connect to Amazon Keyspaces to run a query. But the cluster.connect() method
is too slow: it takes 5 seconds to connect before the query can run.
To solve it, I open a connection when the application starts (that is, whenever a commit lands on the master branch; I'm using CodePipeline) and then keep that connection open the whole time.
I didn't find anything against this in the Python Cassandra driver documentation. Are there any potential problems with this solution?
Upvotes: 2
Views: 659
Reputation: 812
Could you provide the current connection configuration?
Amazon Keyspaces uses Transport Layer Security (TLS) communication by default. If you're not providing the certificate on connection, adding it could help speed things up. For a complete example, check out the Keyspaces Python Sample.
You can also try disabling the following options, which should make the initial connection quicker:
schema_metadata_enabled = False
token_metadata_enabled = False
from ssl import SSLContext, PROTOCOL_TLSv1_2, CERT_REQUIRED

import boto3
from cassandra.cluster import Cluster
from cassandra_sigv4.auth import SigV4AuthProvider

# TLS context that verifies the server against the downloaded root certificate
ssl_context = SSLContext(PROTOCOL_TLSv1_2)
ssl_context.load_verify_locations('path_to_file/sf-class2-root.crt')
ssl_context.verify_mode = CERT_REQUIRED

# SigV4 authentication using the credentials boto3 resolves by default
boto_session = boto3.Session()
auth_provider = SigV4AuthProvider(boto_session)

cluster = Cluster(['cassandra.us-east-2.amazonaws.com'], ssl_context=ssl_context,
                  auth_provider=auth_provider, port=9142)
# Skip schema and token metadata discovery to shorten the initial connection
cluster.schema_metadata_enabled = False
cluster.token_metadata_enabled = False

session = cluster.connect()
r = session.execute('select * from system_schema.keyspaces')
print(r.current_rows)
Upvotes: 3
Reputation: 87224
That is the recommended way: open the connection at startup and keep it (one connection per application). Opening a connection to a Cassandra cluster is an expensive operation because, besides establishing the connection itself, the driver discovers the cluster topology, calculates token ranges, and does many other things. For "normal" Cassandra this usually shouldn't take very long (though it is still expensive), and AWS's emulation may add additional latency on top of it.
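As a rough illustration of the "open once, reuse everywhere" approach, here is a minimal sketch. The names SharedSession, connect_to_keyspaces and keyspaces are hypothetical helpers made up for this example, not part of the driver; the endpoint, port and certificate path are simply the ones from the other answer. The driver imports are placed inside the factory function, so that part is untested here and assumes cassandra-driver, cassandra-sigv4 and boto3 are installed.

```python
import threading
from typing import Any, Callable, Optional


class SharedSession:
    """Hypothetical helper: builds the session once and returns the same object afterwards."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        self._factory = factory
        self._session: Optional[Any] = None
        self._lock = threading.Lock()

    def get(self) -> Any:
        # Double-checked locking: only the first caller pays the connection cost.
        if self._session is None:
            with self._lock:
                if self._session is None:
                    self._session = self._factory()
        return self._session


def connect_to_keyspaces() -> Any:
    # Assumption: same settings as the connection example in the other answer.
    # Imports live here so the module loads even where the driver is not installed.
    from ssl import SSLContext, PROTOCOL_TLSv1_2, CERT_REQUIRED

    import boto3
    from cassandra.cluster import Cluster
    from cassandra_sigv4.auth import SigV4AuthProvider

    ssl_context = SSLContext(PROTOCOL_TLSv1_2)
    ssl_context.load_verify_locations('path_to_file/sf-class2-root.crt')
    ssl_context.verify_mode = CERT_REQUIRED

    cluster = Cluster(['cassandra.us-east-2.amazonaws.com'],
                      ssl_context=ssl_context,
                      auth_provider=SigV4AuthProvider(boto3.Session()),
                      port=9142)
    return cluster.connect()


# One holder per application process; every request handler shares it.
keyspaces = SharedSession(connect_to_keyspaces)
```

In a Flask view you would then call keyspaces.get().execute(...) instead of connecting per request. Calling keyspaces.get() once during application startup moves the slow connect out of the first request. The driver documents Session objects as thread-safe, so sharing one across request threads is the intended usage.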
Upvotes: 3