Reputation: 11
I am new to Cassandra and I am trying to import data from a CSV file into it. First I created the table using
create table cdma_mkt_bte (date_value timestamp primary key, region varchar, vendor varchar);
and then copied the data using
copy cdma_mkt_bte (date_value, region, vendor) from '/usr/share/dse/bin/cdma_mkt_bte' with HEADER = TRUE;
The problem is that the CSV file has about 43,000 rows, but only 211 rows are getting imported into Cassandra. I looked at the 211th and 212th rows to see if anything strange is going on, but they seem fine. Can you please help me? Also, what are the other options for importing a CSV into Cassandra?
Thank you! Would really appreciate the help!
Upvotes: 1
Views: 1752
Reputation: 2283
The options you can use for the COPY command are described in the cqlsh COPY documentation.
Keep looking for a problem in the CSV file. Check for a hidden character at the end of a line; I think I remember a trailing blank space causing a problem for me. The problem may not be at exactly the location reported by the COPY command. In my case, opening the CSV in Excel revealed the problem.
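If Excel is not handy, the same check can be scripted. This is only a rough sketch: the function name find_suspicious_lines and the sample data are made up for illustration, and it flags just trailing whitespace and non-printable characters, which may or may not be what is breaking your import.

```python
import io

def find_suspicious_lines(lines):
    """Return (line_number, reason) pairs for lines that look odd."""
    problems = []
    for number, line in enumerate(lines, start=1):
        body = line.rstrip("\r\n")  # ignore the line terminator itself
        if body != body.rstrip():
            problems.append((number, "trailing whitespace"))
        # Tabs are allowed here; anything else non-printable is reported.
        if any(not ch.isprintable() and ch != "\t" for ch in body):
            problems.append((number, "non-printable character"))
    return problems

# Made-up sample standing in for the real CSV file.
sample = io.StringIO(
    "date_value,region,vendor\n"
    "2014-01-01,north,acme \n"     # trailing space
    "2014-01-02,south,ze\x00ta\n"  # embedded NUL character
)
problems = find_suspicious_lines(sample)
for number, reason in problems:
    print(number, reason)
```

For the real file you would pass open(path, newline="") instead of the in-memory sample, so that stray carriage returns are not silently translated.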
Upvotes: 0
Reputation: 6495
Your primary key seems to be date_value. All inserts and updates in Cassandra are essentially upserts on the primary key: if two records have the same primary key, the second overwrites the first. So if the 43,000 CSV rows contained only 211 distinct date_value values, you would end up with 211 rows. If the way to uniquely identify a record is date_value + region + vendor, then your schema should look like:
create table cdma_mkt_bte (date_value timestamp, region varchar, vendor varchar,
primary key (date_value, region, vendor));
Is this possibly the reason you're not getting the expected number of records?
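One way to test this before changing the schema is to count how many distinct keys the CSV actually contains. A minimal sketch, assuming the header row uses the column names from your table; count_keys and the sample rows are hypothetical, not taken from your data:

```python
import csv
import io

def count_keys(csv_file, key_columns):
    """Return (total_rows, distinct_keys) for the given key columns."""
    reader = csv.DictReader(csv_file)  # first row is treated as the header
    total = 0
    distinct = set()
    for row in reader:
        total += 1
        distinct.add(tuple(row[col] for col in key_columns))
    return total, len(distinct)

# Made-up sample standing in for the real CSV file.
sample = io.StringIO(
    "date_value,region,vendor\n"
    "2014-01-01,north,acme\n"
    "2014-01-01,south,acme\n"
    "2014-01-02,north,zeta\n"
)
total, by_date = count_keys(sample, ["date_value"])
sample.seek(0)  # rewind so the file can be read a second time
_, by_all = count_keys(sample, ["date_value", "region", "vendor"])
print(total, by_date, by_all)  # prints: 3 2 3
```

If the distinct count for date_value alone comes out near 211 on your real file, the overwriting explanation fits. The count for all three columns shows how many rows the composite key would keep.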
Upvotes: 1