Reputation: 103
I have a JavaRDD<List<String>> and I want it to become a JavaPairRDD<String, Integer>, where each String is an element of one of the lists in the original JavaRDD and the Integer is a constant (1).
Is it possible to do something like that?
PS: I already checked this question, but it didn't help me.
Upvotes: 0
Views: 1142
Reputation: 45339
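You can use flatMap followed by mapToPair:
JavaRDD<List<String>> listRdd = null; // assign your RDD here
// Spark 1.x: flatMap expects an Iterable, so the list can be returned directly.
// Spark 2.x and later: it expects an Iterator, so use list -> list.iterator() instead.
JavaPairRDD<String, Integer> rdd = listRdd.flatMap(list -> list)
        .mapToPair(string -> new Tuple2<String, Integer>(string, 1));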
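To check what the output should look like without a cluster, the same flatten-then-pair step can be sketched on plain Java collections. This is only a local analogy built on java.util.stream, not Spark code: the class name FlattenToPairs and the method toPairs are hypothetical, and SimpleEntry stands in for Tuple2.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class; it mirrors flatMap + mapToPair on local lists only.
public class FlattenToPairs {

    // Flatten the nested lists, then pair every element with the constant 1.
    static List<SimpleEntry<String, Integer>> toPairs(List<List<String>> lists) {
        return lists.stream()
                .flatMap(List::stream)                              // analogous to flatMap on the RDD
                .map(s -> new SimpleEntry<String, Integer>(s, 1))   // analogous to mapToPair
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<String>> input = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c"));
        System.out.println(toPairs(input)); // prints [a=1, b=1, c=1]
    }
}
```

Each inner list contributes one pair per element, so the number of pairs equals the total number of strings across all lists. That is the shape the JavaPairRDD should have as well.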
Upvotes: 1
Reputation: 1532
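Please use flatMapToPair:
JavaRDD<List<String>> rdd = ...;
// Spark 1.x signature: call returns an Iterable. In Spark 2.x and later it must
// return an Iterator, so declare Iterator<Tuple2<String, Integer>> and return result.iterator().
JavaPairRDD<String, Integer> flatMapToPair = rdd.flatMapToPair(new PairFlatMapFunction<List<String>, String, Integer>() {
    @Override
    public Iterable<Tuple2<String, Integer>> call(List<String> t) throws Exception {
        // Emit one (element, 1) pair per string in the list.
        List<Tuple2<String, Integer>> result = new ArrayList<>();
        for (String str : t) {
            result.add(new Tuple2<>(str, 1));
        }
        return result;
    }
});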
Upvotes: 1