Reputation: 63
I have and rdd like that :
Map(A -> Map(A1 -> 1))
Map(A -> Map(A2 -> 2))
Map(A -> Map(A3 -> 3))
Map(B -> Map(B1 -> 4))
Map(B -> Map(B2 -> 5))
Map(B -> Map(B3 -> 6))
Map(C -> Map(C1 -> 7))
Map(C -> Map(C2 -> 8))
Map(C -> Map(C3 -> 9))
I need to have the same rdd reduced by key and having as many values as it has previously:
Map(A -> Map(A1 -> 1, A2 -> 2, A3 -> 3))
Map(B -> Map(B1 -> 4, B2 -> 5, B3 -> 6))
Map(C -> Map(C1 -> 7, C2 -> 8, C3 -> 9))
I tried with a reduce:
val prueba = replacements_2.reduce((x,y) => x ++ y)
But only remains the value of the last element evaluated with the same key:
(A,Map(A3 -> 3))
(C,Map(C3 -> 9))
(B,Map(B3 -> 6))
Upvotes: 0
Views: 98
Reputation: 27373
I think you should model your data differently, your Map
approach seems a bit awkward. Why represent 1 entry by a Map
with 1 element? A Tuple2
is more suitable for this... Anyway, you need reduceByKey
. To do this, you first need to convert your rdd to a key-value RDD:
rdd
.map(m => (m.keys.head,m.values.head)) // create key-value RDD
.reduceByKey((a,b) => a++b) // merge maps
.map{case (k,v) => Map(k -> v)} // create Map again
Upvotes: 1