Reputation: 35
I want to import a csv file in cassandra using python script. I already know how to import csv in cassandra using the COPY command in the cqlsh.
My csv file with dtypes: timestamp, timestamp, decimal, decimal, decimal, decimal, decimal, decimal
I already have a code, and I am only lacking on how I would import the csv file inside cassandra
KEYSPACE = "test_cassandra"
cluster = Cluster(['127.0.0.1'])
session = cluster.connect()
log.info("creating keyspace . . .")
session.execute("""
CREATE KEYSPACE IF NOT EXISTS %s
WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'}
""" % KEYSPACE)
log.info("setting keyspace . . .")
session.set_keyspace(KEYSPACE)
log.info("creating table . . .")
session.execute("""
CREATE TABLE IF NOT EXISTS test_NYC_taxi (
pickup timestamp,
dropoff timestamp,
distance decimal,
fare decimal,
p_long decimal,
p_lat decimal,
d_long decimal,
d_lat decimal,
PRIMARY KEY(pickup, dropoff, distance));
""")
prepared = session.prepare("""
INSERT INTO test_nyc_taxi (pickup, dropoff, distance, fare, p_long, p_lat, d_long, d_lat)
VALUES (?, ?, ?, ?, ?, ?, ?, ?)
""")
Any help would be very much appreciated. :)
Upvotes: 1
Views: 3911
Reputation: 57748
You're on the right path. The only things you need to add, is a file reader (I saved your data as a .csv
file and got it to work), a for-loop to iterate through it, and then an import of the datetime
package to convert your dates for the prepared statement.
This worked for me:
with open("/home/aaron/Documents/stackoverflow/test_NYC_taxi.csv", "r") as fares:
for fare in fares:
columns=fare.split(",")
pickup=datetime.datetime.strptime(columns[0],"%Y-%m-%d").date()
dropoff=datetime.datetime.strptime(columns[1],"%Y-%m-%d").date()
distance=columns[2]
fare=columns[3]
p_long=columns[4]
p_lat=columns[5]
d_long=columns[6]
d_lat=columns[7]
session.execute(prepared, [pickup,dropoff,distance,fare,p_long,p_lat,d_long,d_lat])
#closing the file
fares.close()
#closing Cassandra connection
session.shutdown()
Upvotes: 1
Reputation: 1948
Open the csv file reader in python code. Iterate over the lines and call the insert statement. Please check below link.
https://stackoverflow.com/a/19828117/1358551
Upvotes: 0