How to import a dataset from S3 into Cassandra?

Question

I launched a Spark and Cassandra cluster with DataStax DSE on AWS. My dataset is stored in S3, but I don't know how to transfer the data from S3 to my Cassandra cluster. Please help.

phact · Accepted Answer

The details depend on your file format and C* data model but it might look something like this:

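Read the file from S3 into an RDD:

val rdd = sc.textFile("s3n://mybucket/path/filename.txt.gz")

Manipulate the RDD so that each element matches the columns of the target table, for example by parsing each line into a (key, value) tuple.

Write the RDD to a Cassandra table: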

import com.datastax.spark.connector._  // brings saveToCassandra into scope
rdd.saveToCassandra("test", "kv", SomeColumns("key", "value"))
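
Put together, the three steps might look roughly like the sketch below. It is only an illustration under several assumptions that are not in the question: the file holds one "key,value" pair per line, the table test.kv already exists with text columns key and value, S3 credentials are already configured for Hadoop, and the connection host is a placeholder. The object name S3ToCassandra and the helper parseLine are made up for this example. In the DSE Spark shell, sc is already provided, so only the last few lines of main would be needed.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._ // adds saveToCassandra to RDDs

object S3ToCassandra {
  // Hypothetical helper. Assumed input format: one "key,value" pair per line.
  // Lines without a comma, or with an empty key, yield None and are skipped.
  def parseLine(line: String): Option[(String, String)] =
    line.split(",", 2) match {
      case Array(k, v) if k.trim.nonEmpty => Some((k.trim, v.trim))
      case _                              => None
    }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("s3-to-cassandra")
      // Assumption: placeholder address of a Cassandra node in the cluster.
      .set("spark.cassandra.connection.host", "127.0.0.1")
    val sc = new SparkContext(conf)

    // 1. Read the file from S3 (gzip text files are decompressed transparently).
    val lines = sc.textFile("s3n://mybucket/path/filename.txt.gz")

    // 2. Turn each line into a (key, value) tuple, dropping malformed lines.
    val rows = lines.flatMap(line => parseLine(line).toList)

    // 3. Write the tuples to the assumed table test.kv (key text, value text).
    rows.saveToCassandra("test", "kv", SomeColumns("key", "value"))

    sc.stop()
  }
}
```

Keeping the parsing in a plain function makes it easy to check outside Spark before running the job against the whole dataset.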
