Vijay Kambhampati

Reputation: 57

Spark Full Rdd joinWithCassandraTable java.lang.IllegalArgumentException: requirement failed: Invalid row size: instead of

Here's my sample code:

# Cassandra Table Definition 

custId: text PRIMARY KEY
custName: text
custAddress: text

val testDF = Seq(("event-01", "cust-01"), ("event-02", "cust-02")).toDF("eventId", "custId")

val resultRdd = testDF
    .rdd
    .leftJoinWithCassandraTable(
      keyspaceName = "my_key_space",
      tableName = "cust_table",
      selectedColumns = AllColumns,
      joinColumns = SomeColumns("custId")
    )
    .map { case (sparkRow, cassandraRow) =>
      val resultStruct = cassandraRow
        .map(r => Row.fromSeq(r.columnValues))
        .orNull
      Row.fromSeq(sparkRow.toSeq :+ resultStruct)
    }

Upvotes: 2

Views: 435

Answers (1)

Alex Ott

Reputation: 87119

You need to call .on(SomeColumns("custId")) right after leftJoinWithCassandraTable(...) to specify the join columns.

I have a blog post on efficient joins with Cassandra, and it describes the RDD API as well...
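A minimal sketch of how the `.on(...)` call might look, reusing the keyspace, table, and column names from the question. Two things here are assumptions rather than part of the original code: the left-hand RDD is first mapped to a `Tuple1` holding only the join key (so `eventId` is dropped for brevity), and the wrapper names `LeftJoinSketch` and `joinCustomers` are hypothetical. It also assumes the column is stored in Cassandra exactly as `custId` and that the Spark session is already configured to reach the cluster.

```scala
import com.datastax.spark.connector._
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

// Hypothetical wrapper object, for illustration only.
object LeftJoinSketch {

  // Returns (custId, custName if a matching, non-null value exists).
  def joinCustomers(spark: SparkSession): RDD[(String, Option[String])] = {
    import spark.implicits._

    val testDF = Seq(("event-01", "cust-01"), ("event-02", "cust-02"))
      .toDF("eventId", "custId")

    testDF.rdd
      // Assumption: keep only the join key on the left side.
      .map(row => Tuple1(row.getAs[String]("custId")))
      .leftJoinWithCassandraTable("my_key_space", "cust_table")
      // Name the join column explicitly, as suggested above.
      .on(SomeColumns("custId"))
      .map { case (Tuple1(custId), cassandraRow) =>
        // cassandraRow is None when no matching row exists in Cassandra.
        (custId, cassandraRow.flatMap(_.getStringOption("custName")))
      }
  }
}
```

If the other left-side columns are needed in the result, one option is to carry them in a separate keyed RDD and join back on `custId` afterwards.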

Upvotes: 2
