vijayinani

Reputation: 2634

Dataset[Seq[(String, String, String)]] to Dataset[(String, String, String)]

I have a Cassandra table with the following structure:

CREATE TABLE myKeyspace.myTable (
  rowkey text,
  columnname text,
  columnvalue text,
  PRIMARY KEY (rowkey, columnname)
  )

I want to insert data into this table using the Spark Cassandra Connector.

My Spark Dataset is of type Dataset[Seq[(String, String, String)]].

I want to convert it to Dataset[(String, String, String)] so that it can be inserted into the table using the .rdd.saveToCassandra API.

How can I do this conversion? Or is there a direct way to save the Dataset[Seq[(String, String, String)]] as it is?

Upvotes: 0

Views: 214

Answers (1)

s.polam

Reputation: 10382

Call flatMap on the Dataset[Seq[(String, String, String)]] to flatten it into a Dataset[(String, String, String)]. See the example below, and let me know if it does not work.

scala> dds
res124: org.apache.spark.sql.Dataset[Seq[(String, String, String)]] = [value: array<struct<_1:string,_2:string,_3:string>>]

scala> dds.printSchema
root
 |-- value: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- _1: string (nullable = true)
 |    |    |-- _2: string (nullable = true)
 |    |    |-- _3: string (nullable = true)


scala> dds.flatMap(d => d)
res126: org.apache.spark.sql.Dataset[(String, String, String)] = [_1: string, _2: string ... 1 more field]
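
For completeness, here is a minimal sketch of the whole flow outside the REPL, assuming a local SparkSession and some made-up sample rows. The object name FlattenExample and the helper flattenRows are hypothetical names used only for illustration. The write step is left commented out because it assumes the Spark Cassandra Connector is on the classpath and a cluster is reachable; the lowercase keyspace and table names there are also an assumption, based on Cassandra lowercasing unquoted identifiers in CREATE TABLE.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

object FlattenExample {

  // Hypothetical helper: turns each Seq of rows into individual rows.
  def flattenRows(
      spark: SparkSession,
      ds: Dataset[Seq[(String, String, String)]]
  ): Dataset[(String, String, String)] = {
    import spark.implicits._ // supplies the encoder for the tuple type
    ds.flatMap(rows => rows)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("flatten-example")
      .master("local[*]") // assumption: local run, for illustration only
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data in the same shape as the question's Dataset.
    val dds: Dataset[Seq[(String, String, String)]] = Seq(
      Seq(("row1", "colA", "v1"), ("row1", "colB", "v2")),
      Seq(("row2", "colA", "v3"))
    ).toDS()

    val flat = flattenRows(spark, dds)
    flat.show()

    // Assumed write step, requires the Spark Cassandra Connector and a
    // running cluster, so it is not executed in this sketch:
    //
    // import com.datastax.spark.connector._
    // flat.rdd.saveToCassandra(
    //   "mykeyspace", "mytable",
    //   SomeColumns("rowkey", "columnname", "columnvalue"))

    spark.stop()
  }
}
```

Naming the columns explicitly with SomeColumns maps the tuple fields to the table columns by position, which avoids relying on the _1, _2, _3 field names.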

Upvotes: 4
