Reputation: 795
I have a JavaRDD<Tuple2<String, String>> and need to transform it into a JavaPairRDD<String, String>. Currently I do this with a map function that simply returns the input tuple unchanged. Is there a better way?
Upvotes: 13
Views: 20421
Reputation: 1932
Try this to transform a JavaRDD into a JavaPairRDD. It works perfectly for me.
JavaRDD<Sensor> sensorRdd = lines.map(new SensorData()).cache();
// Transform the data into a JavaPairRDD keyed by the sensor ID parsed as an Integer
JavaPairRDD<Integer, Sensor> deviceRdd = sensorRdd.mapToPair(new PairFunction<Sensor, Integer, Sensor>() {
    @Override
    public Tuple2<Integer, Sensor> call(Sensor sensor) throws Exception {
        return new Tuple2<Integer, Sensor>(Integer.parseInt(sensor.getsId().trim()), sensor);
    }
});
Upvotes: 2
Reputation: 893
For the reverse conversion (JavaPairRDD back to JavaRDD), this seems to work:
JavaRDD.fromRDD(JavaPairRDD.toRDD(rdd), rdd.classTag());
Upvotes: 4
Reputation: 1101
Alternatively, you can call mapToPair(..) on your instance of org.apache.spark.api.java.JavaRDD.
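A minimal sketch of what that could look like for the RDD of tuples in the question, assuming Spark's Java API and Java 8 lambdas. The class name TupleRddToPairRdd, the helper toPairs, and the sample data are hypothetical and only there for illustration; the function passed to mapToPair just returns each tuple unchanged.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public class TupleRddToPairRdd {

    // Hypothetical helper: mapToPair with an identity function, so every
    // Tuple2 already in the RDD becomes a key/value entry of the pair RDD.
    static <K, V> JavaPairRDD<K, V> toPairs(JavaRDD<Tuple2<K, V>> tuples) {
        return tuples.mapToPair(t -> t);
    }

    public static void main(String[] args) {
        // Local single-threaded master, chosen only for this illustration.
        SparkConf conf = new SparkConf().setAppName("tuple-to-pair").setMaster("local[1]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            // Made-up sample data.
            List<Tuple2<String, String>> data = Arrays.asList(
                    new Tuple2<String, String>("a", "1"),
                    new Tuple2<String, String>("b", "2"));
            JavaRDD<Tuple2<String, String>> tuples = sc.parallelize(data);

            JavaPairRDD<String, String> pairs = toPairs(tuples);
            System.out.println(pairs.collectAsMap());
        }
    }
}
```

Once the data is a JavaPairRDD, key-based operations such as reduceByKey or collectAsMap become available on it.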
Upvotes: 1
Reputation: 1499
Try this example:
// mutateFunction is a method that generates an RDD of Tuple2 from the rdd_world RDD
JavaRDD<Tuple2<Integer, String>> mutate = mutateFunction(rdd_world);
JavaPairRDD<Integer, String> pairs = JavaPairRDD.fromJavaRDD(mutate);
Upvotes: 2