Balakrishna D

Reputation: 464

How to convert JavaRDD<List<String>> to JavaPairRDD<String, String>

I have a JavaRDD<List<String>>. When I print it, the data looks like this: [[String1,String2,String3],[String4],[String5,String6],[String7,String8,String9]]

Each string is itself pipe-separated, so I can split it to form a key and a value.

How can I convert this RDD to a JavaPairRDD?

Upvotes: 0

Views: 2863

Answers (2)

Rajeev Rathor

Reputation: 1922

Use mapToPair to turn a JavaRDD<V> into a JavaPairRDD<K, V>. The snippet below keys each Sensor by its parsed ID:

JavaPairRDD<Integer, Sensor> deviceRdd = sensorRdd.mapToPair(new PairFunction<Sensor, Integer, Sensor>() {
    public Tuple2<Integer, Sensor> call(Sensor sensor) throws Exception {
        Tuple2<Integer, Sensor> tuple = new Tuple2<Integer, Sensor>(Integer.parseInt(sensor.getsId().trim()), sensor);
        return tuple;
    }
});
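For the pipe-separated strings in the question, the body of call() might look something like the sketch below. It assumes the key is the text before the first pipe and the value is everything after it, which the question does not actually specify. PipePairs and toPair are hypothetical names, and Map.Entry stands in for Tuple2 so the sketch runs without Spark.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

// Hypothetical helper; Map.Entry stands in for Spark's Tuple2.
class PipePairs {
    // Assumption: key = text before the first '|', value = the remainder.
    static Map.Entry<String, String> toPair(String s) {
        // '|' is a regex metacharacter, so it must be escaped.
        // The limit of 2 keeps any later pipes inside the value.
        String[] parts = s.split("\\|", 2);
        String value = parts.length > 1 ? parts[1] : "";
        return new SimpleImmutableEntry<>(parts[0], value);
    }

    public static void main(String[] args) {
        Map.Entry<String, String> p = toPair("id42|alice|bob");
        System.out.println(p.getKey() + " -> " + p.getValue());
    }
}
```

Note the escaped separator: an unescaped split("|") is treated as a regex alternation and splits between every character.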

Upvotes: 0

Yuan JI

Reputation: 2995

Assume your JavaRDD<List<String>> holds data like this:

List_0: ["sub10~sub11~sub12","sub20~sub21~sub22","sub30~sub31~sub32"]
List_1: ["sub40~sub41~sub42"]

Where ~ is the separator.

Suppose you want to flatten the lists and, for each input string, build a key from its first and third substrings joined with |, then store the pairs in a JavaPairRDD<String,String>:

key: "sub10|sub12"    value: "sub10~sub11~sub12"

You could achieve this by using flatMap and then mapToPair:

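rdd.flatMap(new FlatMapFunction<List<String>, String>() {
    // Spark 1.x signature; from Spark 2.0 on, call() returns Iterator<String>, so return li.iterator() instead.
    public Iterable<String> call(List<String> li) throws Exception {
        return li; // flatten: emit every string of the inner list
    }
}).mapToPair(new PairFunction<String, String, String>() {
    public Tuple2<String, String> call(String s) throws Exception {
        String[] ss = s.split("~");
        // key = first and third fields joined with "|", value = the original string
        return new Tuple2<String, String>(ss[0] + "|" + ss[2], s);
    }
});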
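The same two steps can be checked locally without a cluster. The sketch below mirrors the flatten-then-pair logic on plain Java collections; FlattenAndKey, toPair and toPairs are hypothetical names, Map.Entry stands in for Tuple2, and every input string is assumed to have at least three ~-separated fields.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical local stand-in for the flatMap + mapToPair pipeline.
class FlattenAndKey {
    // Mirrors the PairFunction: key = first and third fields joined with "|".
    // Assumption: s contains at least three "~"-separated fields.
    static Map.Entry<String, String> toPair(String s) {
        String[] ss = s.split("~");
        return new SimpleImmutableEntry<>(ss[0] + "|" + ss[2], s);
    }

    // Mirrors flatMap (flatten the inner lists) followed by mapToPair.
    static List<Map.Entry<String, String>> toPairs(List<List<String>> lists) {
        return lists.stream()
                .flatMap(List::stream)
                .map(FlattenAndKey::toPair)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<String>> data = Arrays.asList(
                Arrays.asList("sub10~sub11~sub12", "sub20~sub21~sub22", "sub30~sub31~sub32"),
                Arrays.asList("sub40~sub41~sub42"));
        for (Map.Entry<String, String> e : toPairs(data)) {
            System.out.println("key: " + e.getKey() + "    value: " + e.getValue());
        }
    }
}
```

Running main on the sample data should print four pairs, the first being key: sub10|sub12 with value: sub10~sub11~sub12.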

Upvotes: 1
