Reputation: 53
I have an RDD of key-value data in Scala. I want to transform it so that each element has the form (key, tuple(values)).
I tried using map, but it did not work. In PySpark I would have used
map(lambda x: (x[0], list(x[1:])))
For example, I want to convert
(a,1,2,3,4), (b,4,5,6), (c,1,3)
to [a,(1,2,3,4)], [b,(4,5,6)], [c,(1,3)]
Upvotes: 3
Views: 922
Reputation: 22635
In Scala, tuples are hard to handle in a generic way (this changes in Scala 3), so the most straightforward solution is to create a helper object with an overloaded function, one overload per tuple size:
object TupleUtil {
  def splitHead[K, V](t: (K, V, V)): (K, (V, V)) = t._1 -> (t._2, t._3)
  def splitHead[K, V](t: (K, V, V, V)): (K, (V, V, V)) = t._1 -> (t._2, t._3, t._4)
  def splitHead[K, V](t: (K, V, V, V, V)): (K, (V, V, V, V)) = t._1 -> (t._2, t._3, t._4, t._5)
  // etc. up to 22
}
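A minimal usage sketch for the helper, assuming every row has the same arity so the compiler can pick one overload. A plain Seq stands in for the RDD here; with Spark the same function would presumably be passed to rdd.map. The names SplitHeadDemo, rows and split are hypothetical, and two overloads are repeated only to keep the sketch self-contained.

```scala
// Two overloads modelled on the helper above, repeated so this compiles on its own.
object TupleUtil {
  def splitHead[K, V](t: (K, V, V)): (K, (V, V)) = (t._1, (t._2, t._3))
  def splitHead[K, V](t: (K, V, V, V)): (K, (V, V, V)) = (t._1, (t._2, t._3, t._4))
}

object SplitHeadDemo {
  // Hypothetical sample data: all rows are 4-tuples, so one overload applies.
  val rows: Seq[(String, Int, Int, Int)] = Seq(("a", 1, 2, 3), ("b", 4, 5, 6))

  // The overload is chosen at compile time from the static type of r.
  val split: Seq[(String, (Int, Int, Int))] = rows.map(r => TupleUtil.splitHead(r))

  def main(args: Array[String]): Unit = split.foreach(println)
}
```

Running SplitHeadDemo prints (a,(1,2,3)) and (b,(4,5,6)).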
Or if you can use shapeless, then you could simply do:
import shapeless.syntax.std.tuple._
(t.head, t.tail)
To use shapeless, add it to your build.sbt:
libraryDependencies += "com.chuusai" %% "shapeless" % "2.3.3"
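The sample rows in the question have different lengths, so in Scala they would not share a single tuple type. One possible fallback, which is not part of the answer above, is to treat each row as a Product and split its productIterator. This is only a sketch: ProductSplit and its members are hypothetical names, the values come back as a List rather than a tuple, and all static element types are lost (everything is Any).

```scala
object ProductSplit {
  // Splits any non-empty tuple (every tuple is a Product) into its first
  // element and the remaining elements. Throws on an empty Product.
  def splitHead(p: Product): (Any, List[Any]) = {
    val elems = p.productIterator.toList
    (elems.head, elems.tail)
  }

  // Hypothetical sample data mirroring the question; rows of mixed arity
  // can only be typed as Product.
  val rows: Seq[Product] = Seq(("a", 1, 2, 3, 4), ("b", 4, 5, 6), ("c", 1, 3))

  val split: Seq[(Any, List[Any])] = rows.map(splitHead)

  def main(args: Array[String]): Unit = split.foreach(println)
}
```

The trade-off is type safety: the overloaded helper and shapeless keep precise tuple types, while this version accepts any arity at the cost of working with Any.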
Upvotes: 4