Reputation: 41
I have a big problem!
I have an RDD[(Int, Vector)], where the Int is a sort of label.
For example:
(0, (a,b,c));
(0, (d,e,f));
(1, (g,h,i))
etc...
Now, I need to use this RDD (I call it myrdd) like this:

myrdd.map { case (l, v) =>
  myrdd.map { case (l_, v_) =>
    compare(v, v_)
  }
}
Now, I know that nesting RDD operations like this is impossible in Spark.
I could bypass the problem by collecting one copy into an Array, but for my problem I can't use an Array, or anything else that has to fit in memory.
How can I solve this WITHOUT USING AN ARRAY?
Thanks in advance!!!
Upvotes: 0
Views: 224
Reputation: 67135
cartesian sounds like it should work:
myrdd.cartesian(myrdd).map {
  case ((_, v), (_, v_)) => compare(v, v_)
}
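
For reference, here is a minimal, self-contained sketch of that approach. Everything beyond cartesian itself is an assumption for illustration: I'm guessing Vector is MLlib's Vector, the sample data is made up to match the shape in the question, and compare is a stand-in (squared Euclidean distance via Vectors.sqdist) since the real compare isn't shown.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.{Vector, Vectors}

object CartesianCompare {
  // Stand-in for the question's compare(): squared Euclidean distance.
  def compare(v: Vector, w: Vector): Double = Vectors.sqdist(v, w)

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("cartesian-compare").setMaster("local[*]"))

    // Sample data shaped like the question: (label, vector)
    val myrdd = sc.parallelize(Seq(
      (0, Vectors.dense(1.0, 2.0, 3.0)),
      (0, Vectors.dense(4.0, 5.0, 6.0)),
      (1, Vectors.dense(7.0, 8.0, 9.0))
    ))

    // cartesian builds every (left, right) pair as a distributed RDD,
    // so nothing has to be collected into an in-memory Array.
    val comparisons = myrdd.cartesian(myrdd).map {
      case ((_, v), (_, v_)) => compare(v, v_)
    }

    comparisons.collect().foreach(println)
    sc.stop()
  }
}

Note that cartesian of an RDD with itself yields n² pairs (each element is also paired with itself), so it stays distributed but can get expensive for large inputs.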
Upvotes: 2