Luke

Reputation: 7089

Mapping cassandra row to parametrized type in Spark RDD

I'm trying to map a Cassandra row to a parametrized type using the spark-cassandra-connector. I've been trying to define the mapping using an implicitly defined columnMapper, as follows:

import scala.reflect.ClassTag

import com.datastax.spark.connector.mapper.JavaBeanColumnMapper
import com.datastax.spark.connector.rdd.CassandraTableScanRDD
import com.datastax.spark.connector.rdd.reader.RowReaderFactory

class Foo[T <: Bar : ClassTag : RowReaderFactory] {
  implicit object Mapper extends JavaBeanColumnMapper[T](
    Map("id" -> "id",
        "timestamp" -> "ts"))

  def doSomeStuff(operations: CassandraTableScanRDD[T]): Unit = {
    println("do some stuff here")
  }
}

However, I'm running into the following error, which I believe is because I'm passing in a RowReaderFactory without properly specifying its mapping. Any idea how to specify the mapping information for a RowReaderFactory?

Exception in thread "main" java.lang.IllegalArgumentException: Failed to map constructor parameter timestamp in Bar to a column of MyNamespace
    at com.datastax.spark.connector.mapper.DefaultColumnMapper$$anonfun$4$$anonfun$apply$1.apply(DefaultColumnMapper.scala:78)
    at com.datastax.spark.connector.mapper.DefaultColumnMapper$$anonfun$4$$anonfun$apply$1.apply(DefaultColumnMapper.scala:78)
    at scala.Option.getOrElse(Option.scala:120)
    at com.datastax.spark.connector.mapper.DefaultColumnMapper$$anonfun$4.apply(DefaultColumnMapper.scala:78)
    at com.datastax.spark.connector.mapper.DefaultColumnMapper$$anonfun$4.apply(DefaultColumnMapper.scala:76)
    at scala.collection.TraversableLike$WithFilter$$anonfun$map$2.apply(TraversableLike.scala:722)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at scala.collection.TraversableLike$WithFilter.map(TraversableLike.scala:721)
    at com.datastax.spark.connector.mapper.DefaultColumnMapper.columnMapForReading(DefaultColumnMapper.scala:76)
    at com.datastax.spark.connector.rdd.reader.GettableDataToMappedTypeConverter.<init>(GettableDataToMappedTypeConverter.scala:56)
    at com.datastax.spark.connector.rdd.reader.ClassBasedRowReader.<init>(ClassBasedRowReader.scala:23)
    at com.datastax.spark.connector.rdd.reader.ClassBasedRowReaderFactory.rowReader(ClassBasedRowReader.scala:48)
    at com.datastax.spark.connector.rdd.reader.ClassBasedRowReaderFactory.rowReader(ClassBasedRowReader.scala:43)
    at com.datastax.spark.connector.rdd.CassandraTableRowReaderProvider$class.rowReader(CassandraTableRowReaderProvider.scala:48)
    at com.datastax.spark.connector.rdd.CassandraTableScanRDD.rowReader$lzycompute(CassandraTableScanRDD.scala:59)
    at com.datastax.spark.connector.rdd.CassandraTableScanRDD.rowReader(CassandraTableScanRDD.scala:59)
    at com.datastax.spark.connector.rdd.CassandraTableRowReaderProvider$class.verify(CassandraTableRowReaderProvider.scala:147)
    at com.datastax.spark.connector.rdd.CassandraTableScanRDD.verify(CassandraTableScanRDD.scala:59)
    at com.datastax.spark.connector.rdd.CassandraTableScanRDD.getPartitions(CassandraTableScanRDD.scala:143)

Upvotes: 2

Views: 1085

Answers (2)

Josh Marcus

Reputation: 1739

You can define that implicit in the companion object of Foo, as follows:

object Foo {
  implicit object Mapper extends JavaBeanColumnMapper[T](
    Map("id" -> "id",
        "timestamp" -> "ts"))
}

Scala will look in the companion object of a class when trying to find an implicit instance for that class. You can define it in the scope where the implicit is needed if you want, but you probably want to add it to the companion object so you don't have to repeat it everywhere it's needed.
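A simplified sketch of that lookup rule, independent of the connector (Show, Config, and render are made-up names purely for illustration):

trait Show[A] { def show(a: A): String }

case class Config(name: String)

object Config {
  // Defined in the companion object, so it sits in the implicit scope of
  // Show[Config] and is found without any import at the call site.
  implicit val showConfig: Show[Config] = new Show[Config] {
    def show(c: Config): String = s"Config(${c.name})"
  }
}

object Demo extends App {
  def render[A](a: A)(implicit s: Show[A]): String = s.show(a)
  println(render(Config("prod"))) // resolves showConfig from Config's companion object
}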

Upvotes: 1

Luke

Reputation: 7089

It turns out that the columnMapper has to be created in the scope where the instance of Foo is created, not inside Foo itself.
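For reference, a minimal sketch of what that looks like at the call site (the keyspace/table names and the SparkContext sc are placeholders, not from the original code):

import com.datastax.spark.connector._
import com.datastax.spark.connector.mapper.JavaBeanColumnMapper

// At the call site the type parameter is known concretely (Bar here), so this
// implicit mapper is visible when the RowReaderFactory[Bar] evidence required
// by Foo's context bounds is resolved.
implicit val barMapper = new JavaBeanColumnMapper[Bar](
  Map("id" -> "id",
      "timestamp" -> "ts"))

val foo = new Foo[Bar]
foo.doSomeStuff(sc.cassandraTable[Bar]("my_keyspace", "my_table"))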

Upvotes: 1
