sirdan13

Reputation: 103

Converting JavaRDD<List<String>> to JavaPairRDD<String, Integer>

I have a JavaRDD<List<String>> and I want to turn it into a JavaPairRDD<String, Integer>, where each key is one of the strings contained in the lists of the original RDD and each value is the constant 1. Is something like that possible? PS: I already checked a related question, but it didn't help me.

Upvotes: 0

Views: 1142

Answers (2)

ernest_k

Reputation: 45339

You can use:

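// Needs org.apache.spark.api.java.JavaRDD, JavaPairRDD and scala.Tuple2.
JavaRDD<List<String>> listRdd = null; // assign your RDD here
// Spark 1.x: flatMap expects an Iterable. On Spark 2.x and later, use list -> list.iterator().
JavaPairRDD<String, Integer> rdd = listRdd.flatMap(list -> list)
     .mapToPair(string -> new Tuple2<String, Integer>(string, 1));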
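The flatten-then-pair logic can also be checked locally without Spark. The following is only a sketch on an in-memory list using java.util.stream, not the RDD API: the class name FlattenToPairs, the method toPairs and the sample data are made up for illustration, and SimpleEntry stands in for Tuple2.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class: plain-Java analogue of flatMap + mapToPair.
class FlattenToPairs {

    // Flattens the nested lists, then pairs every string with the constant 1.
    static List<SimpleEntry<String, Integer>> toPairs(List<List<String>> lists) {
        return lists.stream()
                .flatMap(List::stream)                               // one element per string
                .map(s -> new SimpleEntry<String, Integer>(s, 1))    // (string, 1)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<String>> input = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("a"));
        // Duplicates are kept, one pair per occurrence.
        System.out.println(toPairs(input)); // prints [a=1, b=1, a=1]
    }
}
```

Each occurrence of a string yields its own pair, so duplicates are not merged at this stage.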

Upvotes: 1

Igor Berman

Reputation: 1532

Use flatMapToPair, which flattens the lists and builds the pairs in a single step:

        JavaRDD<List<String>> rdd = ...;

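        // Spark 1.x signature: call returns an Iterable.
        // On Spark 2.x and later it must return Iterator<Tuple2<String, Integer>>, so return result.iterator().
        JavaPairRDD<String, Integer> flatMapToPair = rdd.flatMapToPair(new PairFlatMapFunction<List<String>, String, Integer>() {

            @Override
            public Iterable<Tuple2<String, Integer>> call(List<String> t) throws Exception {
                // Emit one (string, 1) pair per element of the incoming list.
                List<Tuple2<String, Integer>> result = new ArrayList<>();
                for (String str : t) {
                    result.add(new Tuple2<>(str, 1));
                }
                return result;
            }
        });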

Upvotes: 1
