Reputation: 1159
I am writing generic mathematical operation functions that work on Spark RDDs of numeric values.
For multiplication, I have something that looks like this:
def mult(rdd1: RDD[AnyVal], rdd2: RDD[AnyVal]): RDD[AnyVal] = {
  rdd1.zip(rdd2).map(row => row._1 * row._2)
}
* is not a member of AnyVal, so this doesn't compile. Is there something I could do to make this work?
Upvotes: 0
Views: 927
Reputation: 3081
What about using Numeric for numeric types? This should work:
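import org.apache.spark.rdd.RDD
import scala.reflect.ClassTag

// The ClassTag bound is there because RDD.zip and RDD.map ask for one for their element types.
def mult[X: Numeric: ClassTag](rdd1: RDD[X], rdd2: RDD[X]): RDD[X] = {
  import Numeric.Implicits._ // brings the infix * operator into scope for any Numeric X
  rdd1.zip(rdd2).map(row => row._1 * row._2)
}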
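To see what the Numeric context bound buys you without starting Spark, here is a minimal sketch of the same idea on a plain Seq. The names NumericDemo and multSeq are made up for this illustration and are not part of any library.

```scala
object NumericDemo {
  import Numeric.Implicits._ // infix *, +, - for any type with a Numeric instance

  // Hypothetical local analogue of mult, using Seq instead of RDD.
  def multSeq[X: Numeric](xs: Seq[X], ys: Seq[X]): Seq[X] =
    xs.zip(ys).map { case (x, y) => x * y }

  // The same function works for Int and Double because both have Numeric instances.
  val ints: Seq[Int] = multSeq(Seq(1, 2, 3), Seq(4, 5, 6))
  val doubles: Seq[Double] = multSeq(Seq(1.5, 2.0), Seq(2.0, 0.5))
}
```

Note that both arguments must have the same element type X; mixing, say, Int and Double would need a conversion first or a different approach.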
If you want to be able to multiply anything with anything, then you need to tell the compiler how to do it.
To do so, let's declare a trait that describes the functionality:
trait Multiplier[A, B, C] {
  def multiply(a: A, b: B): C
}
Now you can define a generic function multiply that lifts the multiplication to other types (I will use Seq here; you can use RDD instead):
def multiply[A, B, C](as: Seq[A], bs: Seq[B])(implicit multiplier: Multiplier[A, B, C]): Seq[C] =
  as zip bs map (p => multiplier.multiply(p._1, p._2))
Now let's tell the compiler how to multiply an Int with a String. (Scala can multiply a String with an Int, but not the other way around.) So let's define the multiplier:
implicit object IntStringMultipler extends Multiplier[Int, String, Seq[String]] {
  override def multiply(a: Int, b: String): Seq[String] = (1 to a) map (_ => b)
}
To make it more interesting, 2 * "x" will be Seq("x", "x"), not "xx" like Scala's own "x" * 2.
Now we can call multiply(Seq(2, 3), Seq("a", "b")) to get List(Vector("a", "a"), Vector("b", "b", "b")).
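The two approaches can be combined. As a sketch (the wrapper object MultiplierDemo and the numericMultiplier instance are my own additions for illustration, not part of the code above), you could derive a Multiplier for every Numeric type and keep the Int/String instance next to it. Putting the instances in the trait's companion object is one way to let the compiler find them without an extra import.

```scala
object MultiplierDemo {
  // Type class from the answer: how to multiply an A by a B to get a C.
  trait Multiplier[A, B, C] {
    def multiply(a: A, b: B): C
  }

  object Multiplier {
    // Hypothetical instance: any Numeric type times itself, delegating to Numeric.times.
    implicit def numericMultiplier[X](implicit num: Numeric[X]): Multiplier[X, X, X] =
      new Multiplier[X, X, X] {
        override def multiply(a: X, b: X): X = num.times(a, b)
      }

    // Instance from the answer: repeat the string `a` times, returned as a Seq.
    implicit object IntStringMultipler extends Multiplier[Int, String, Seq[String]] {
      override def multiply(a: Int, b: String): Seq[String] = (1 to a).map(_ => b)
    }
  }

  // Lifts element-wise multiplication to sequences, as in the answer.
  def multiply[A, B, C](as: Seq[A], bs: Seq[B])(implicit multiplier: Multiplier[A, B, C]): Seq[C] =
    as.zip(bs).map { case (a, b) => multiplier.multiply(a, b) }

  val products = multiply(Seq(2, 3), Seq(4, 5))     // resolved via numericMultiplier
  val repeated = multiply(Seq(2, 3), Seq("a", "b")) // resolved via IntStringMultipler
}
```

The compiler picks the instance from the element types of the two arguments, so the result type C never has to be written at the call site.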
Upvotes: 2