Reputation: 73
I have an existing dataset of around 700,000 records in CSV format, and I have imported that file into an Apache Cassandra table. The problem is the
primary key: how can I automatically generate (upsert) a UUID into the primary key column for all of my records? I am using Cassandra 3.10.
Upvotes: 1
Views: 1647
Reputation: 1385
Unfortunately, if you're using the COPY command, you don't really have any options for generating UUIDs on the fly for your rows. I think you really have two options, both of which involve doing things programmatically to one extent or another:
1. Preprocess the CSV file to add a UUID to each row, writing out a new file with that additional field and a UUID value for each row. It should be pretty straightforward to process the file line by line and generate those values using a small Python script or something similar. Then you can use the COPY command like before to import the data into Cassandra.
2. Skip the COPY command altogether and just write the code in Python (or Java or your language of choice) to read the file, parse each CSV line into values, generate a UUID for that row, and then INSERT the data into Cassandra using the appropriate driver for the programming language you're using.
If you decide to go with option 2, you'll find a list of the DataStax drivers for Cassandra towards the bottom of this page, along with documentation for how to use them. Hope that helps!
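For the first option (adding a UUID column to the file before running COPY), something like the following might be enough. It is only a minimal sketch using the Python standard library. The function name add_uuid_column, the column name "id", and the file names are placeholders I made up, and it assumes your CSV has a header row and that random (version 4) UUIDs are acceptable as keys.

```python
import csv
import uuid


def add_uuid_column(src_path, dst_path, has_header=True, id_column="id"):
    """Copy a CSV file, prepending a random (version 4) UUID to every row.

    Returns the number of data rows written. The names used here are
    illustrative; adjust them to match your own table.
    """
    count = 0
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        if has_header:
            header = next(reader, None)
            if header is not None:
                # Add a name for the new key column to the header row.
                writer.writerow([id_column] + header)
        for row in reader:
            if not row:
                # Skip blank lines rather than emitting a UUID-only row.
                continue
            writer.writerow([str(uuid.uuid4())] + row)
            count += 1
    return count
```

You would call it as add_uuid_column("data.csv", "data_with_ids.csv") and then point COPY ... FROM at the new file, listing the key column first and using WITH HEADER = true so the header row is skipped.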
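For the second option (skipping COPY and inserting through a driver), a rough sketch with the DataStax Python driver (the cassandra-driver package) could look like this. The keyspace, table, column names, and contact point are hypothetical, and the sketch assumes the CSV has a header row and that the non-key columns are text. Values for other column types would need converting first, because prepared statements expect values matching the column types.

```python
import csv
import uuid


def load_csv(session, csv_path, table, columns, id_column="id"):
    """Insert every CSV row with a freshly generated UUID as its key.

    `session` is expected to behave like a cassandra-driver Session.
    Table and column names are placeholders supplied by the caller.
    """
    names = ", ".join([id_column] + list(columns))
    marks = ", ".join(["?"] * (len(columns) + 1))
    # Prepare once, execute many times with different bound values.
    insert = session.prepare(f"INSERT INTO {table} ({names}) VALUES ({marks})")
    count = 0
    with open(csv_path, newline="") as src:
        for row in csv.DictReader(src):
            # CSV values arrive as strings; convert here if a column is not text.
            values = [uuid.uuid4()] + [row[c] for c in columns]
            session.execute(insert, values)
            count += 1
    return count


def main():
    # Imported here so the rest of the module loads without the driver.
    from cassandra.cluster import Cluster  # pip install cassandra-driver

    cluster = Cluster(["127.0.0.1"])          # assumed contact point
    session = cluster.connect("my_keyspace")  # hypothetical keyspace
    try:
        n = load_csv(session, "data.csv", "my_table", ["name", "city"])
        print(f"inserted {n} rows")
    finally:
        cluster.shutdown()
```

Calling main() runs the import against a local node. With around 700,000 rows, one synchronous execute per row may be slow, so it could be worth looking at the driver's concurrent execution helpers once the basic version works.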
Upvotes: 2