Reputation: 571
I have two tables in Cassandra:
CREATE TABLE table1 (
  name text PRIMARY KEY,
  grade text,
  labid List<int>);
CREATE TABLE table2 (
  name text PRIMARY KEY,
  deptid List<int>,
  grade text);
For example:
val result: RDD[(String, String, List[Int])] = myFunction()
result.saveToCassandra(keyspace, table1)
This works fine, but when I use the line below instead:
result.saveToCassandra(keyspace, table2)
I get this error: com.datastax.spark.connector.types.TypeConversionException: Cannot convert object test_data of type class java.lang.String to List[AnyRef]
Is there a solution using SomeColumns that works for both tables (we don't know in advance which table will be written to)? For example:
result.saveToCassandra(keyspace, table, SomeColumns(....))?
Upvotes: 1
Views: 531
Reputation: 3260
By default, the DataFrame schema is matched by position, not by column name, so if your C* table has a different column order, you will get incorrect writes. The solution, as you said, is to use SomeColumns.
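// The connector import provides saveToCassandra, SomeColumns and the String-to-ColumnRef conversion
import com.datastax.spark.connector._
val columns = dataFrame.schema.map(_.name: ColumnRef)
dataFrame.rdd.saveToCassandra(keyspaceName, tableName, SomeColumns(columns: _*))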
Now the DataFrame columns will be written to C* by name, not by position.
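For the tuple RDD in the question, the same idea can be sketched as below. This is a minimal sketch, not a tested solution: `SaveByColumnName`, `columnsFor` and `save` are hypothetical names, and it assumes the tuple fields are always (name, grade, list of ids), so only the name of the list column changes between table1 and table2.

```scala
import com.datastax.spark.connector._ // saveToCassandra, SomeColumns, ColumnName, ColumnRef
import org.apache.spark.rdd.RDD

object SaveByColumnName {
  // Hypothetical helper: column names listed in the same order as the
  // tuple fields (name, grade, ids). The table names come from the question.
  def columnsFor(table: String): Seq[String] = table match {
    case "table1" => Seq("name", "grade", "labid")
    case "table2" => Seq("name", "grade", "deptid")
    case other    => throw new IllegalArgumentException(s"Unknown table: $other")
  }

  // Writes each tuple field to the column named at the same position in
  // columnsFor(table), regardless of the column order in the table itself.
  def save(result: RDD[(String, String, List[Int])],
           keyspace: String,
           table: String): Unit = {
    val columns: Seq[ColumnRef] = columnsFor(table).map(n => ColumnName(n))
    result.saveToCassandra(keyspace, table, SomeColumns(columns: _*))
  }
}
```

With this, SaveByColumnName.save(result, keyspace, table) should work for either table without reordering the tuple, since SomeColumns pairs the tuple fields with the listed column names in order.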
Upvotes: 1
Reputation: 53819
Your tuple fields need to be in a different order for table2, because its columns are in a different order (name, deptid, grade), so the String grade ends up in the list column:
val result: RDD[(String, String, List[Int])] = myFunction()
val reorder: RDD[(String, List[Int], String)] = result.map(r => (r._1, r._3, r._2))
reorder.saveToCassandra(keyspace, table2)
Upvotes: 0