user3375659

Reputation: 65

Bulk Loading in Cassandra

I have a requirement to bulk load data into Cassandra. I searched Google and found that the SSTable loader can be used to bulk load data into Cassandra. I am using DataStax and want to know whether I can use Apache Sqoop instead; my bulk data is in CSV format. If I can use Apache Sqoop, can someone please give the syntax for loading bulk CSV data with Sqoop?

Upvotes: 1

Views: 1867

Answers (1)

Daniel S.

Reputation: 3514

Sqoop is for importing from JDBC stores (relational databases), not CSV files, so you can't use it.

If your file is small (i.e. fits on one machine), you should consider importing using CQL shell COPY FROM. First create your tables to match the schema you're importing, and then run this statement from the CQL shell (use your own columns, filename and delimiter):

COPY mytable (col1, col2, col3) FROM 'myfile.csv' WITH DELIMITER=',';

And then you're done. So this is the easy way.
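
To make the two steps concrete, here is a rough sketch of what the whole cqlsh session might look like. The keyspace name (demo), the column types, and the replication settings are all assumptions made up for illustration, so adjust them to your own schema and cluster. The HEADER option only applies if the first line of your CSV holds column names.

```sql
-- Hypothetical keyspace and table; names and types are placeholders.
CREATE KEYSPACE demo
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

CREATE TABLE demo.mytable (
  col1 int PRIMARY KEY,
  col2 text,
  col3 text
);

-- COPY is a cqlsh command, so run it from the CQL shell.
-- HEADER = true tells cqlsh to skip the first line of the file.
COPY demo.mytable (col1, col2, col3) FROM 'myfile.csv'
  WITH DELIMITER = ',' AND HEADER = true;
```

The column list in COPY should follow the order of the fields in the CSV file, which does not have to be the order in which the table declares them.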

As for the SSTableLoader: last I heard, to use it you need to write a custom Java program that converts your file into SSTables. From what you've described, this may not be the best approach for your scenario. Still, if your CSV file is really huge, here's a blog post describing the steps involved (it's a complex walkthrough, so I'm not going to repeat it here).

Upvotes: 2
