Sathiya Narayanan

Reputation: 641

Get the key of a JavaPairRDD

I have a JavaPairRDD<String, Iterable<Tuple2<String, String>>>.

I wrote it to a file, and the content is:

(ABC,[(ABC,1)])
(BBC,[(BBC,1)])
(CBD,[(CBD,1)])
(BBD,[(BBD,1)])
(ACD,[(ACD,1)])

Now I want to extract only the key strings (ABC, BBC, CBD, BBD, ACD) into a JavaRDD and write them to a file.

So far I am only able to print them to the console using foreach:

pairRdd.foreach(new VoidFunction<Tuple2<String, Iterable<Tuple2<String, String>>>>() {

    @Override
    public void call(Tuple2<String, Iterable<Tuple2<String, String>>> t) throws Exception {
        // Print only the key of each pair
        System.out.println(t._1());
    }
});

I want to do the same, but write to a file instead. I am new to Spark and don't know how to achieve this. Any help would be much appreciated. Thanks in advance.

Upvotes: 0

Views: 1506

Answers (1)

Alex Karpov

Reputation: 564

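Try keys(), which returns a JavaRDD<String> containing only the keys, followed by coalesce(1) so that the output is written as a single part file: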

pairRdd.keys().coalesce(1).saveAsTextFile("some_path");
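
For context, here is a minimal end-to-end sketch of the same idea. It assumes Spark's Java API is on the classpath and runs in local mode. The class name KeysToFile, the helper keysOf, the sample rows, and the default output path keys_output are all made up for illustration and are not from the question.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

// Hypothetical example class; names and sample data are assumptions.
public class KeysToFile {

    // Returns an RDD holding only the keys of the grouped pair RDD.
    static JavaRDD<String> keysOf(
            JavaPairRDD<String, Iterable<Tuple2<String, String>>> pairRdd) {
        return pairRdd.keys();
    }

    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("KeysToFile")
                .setMaster("local[*]"); // local mode, for illustration only

        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            // Sample rows shaped like the data in the question.
            List<Tuple2<String, String>> rows = Arrays.asList(
                    new Tuple2<>("ABC", "1"),
                    new Tuple2<>("BBC", "1"),
                    new Tuple2<>("CBD", "1"));

            // Group by the first element to get
            // JavaPairRDD<String, Iterable<Tuple2<String, String>>>.
            JavaPairRDD<String, Iterable<Tuple2<String, String>>> pairRdd =
                    sc.parallelize(rows).groupBy(t -> t._1());

            // Output path is an assumption; pass a real one as the first argument.
            String outputPath = args.length > 0 ? args[0] : "keys_output";

            // coalesce(1) merges everything into one partition, so one part file is written.
            keysOf(pairRdd).coalesce(1).saveAsTextFile(outputPath);
        }
    }
}
```

Note that saveAsTextFile creates a directory at the given path (with the data in a part-00000 file inside it) and fails if that directory already exists. Because coalesce(1) funnels all the data through a single task, it is best kept for small outputs like this one.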

Upvotes: 0
