Reputation: 22709
Why does this code produce this exception? How can I avoid it?
import java.util.ArrayList;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

SparkConf conf = new SparkConf().setAppName("startingSpark").setMaster("local[*]");
JavaSparkContext sc = new JavaSparkContext(conf);

List<Tuple2<Integer, Integer>> visitsRaw = new ArrayList<>();
visitsRaw.add(new Tuple2<>(4, 18));
visitsRaw.add(new Tuple2<>(6, 4));
visitsRaw.add(new Tuple2<>(10, 9));

List<Tuple2<Integer, String>> usersRaw = new ArrayList<>();
usersRaw.add(new Tuple2<>(1, "John"));
usersRaw.add(new Tuple2<>(2, "Bob"));
usersRaw.add(new Tuple2<>(3, "Alan"));
usersRaw.add(new Tuple2<>(4, "Doris"));
usersRaw.add(new Tuple2<>(5, "Marybelle"));
usersRaw.add(new Tuple2<>(6, "Raquel"));

JavaPairRDD<Integer, Integer> visits = sc.parallelizePairs(visitsRaw);
JavaPairRDD<Integer, String> users = sc.parallelizePairs(usersRaw);
JavaPairRDD<Integer, Tuple2<Integer, String>> joinedRdd = visits.join(users);
joinedRdd.foreach(System.out::println); // the exception is thrown here
sc.close();
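(For context: the exception this code produces is Spark's Task not serializable error, i.e. org.apache.spark.SparkException, with an underlying java.io.NotSerializableException: java.io.PrintStream.)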
Upvotes: 0
Views: 50
Reputation: 7207
The method reference System.out::println is not serializable. Spark must serialize the function passed to foreach() in order to ship it to the executors, and a bound method reference captures its receiver, System.out, which is a java.io.PrintStream and not Serializable. A lambda avoids the problem because it captures nothing; it only reads System.out at the moment it is invoked:
joinedRdd.foreach(v -> System.out.println(v));
Alternatively, to print the values on the driver node, collect the RDD first. Here List.forEach runs locally, so nothing has to be serialized (but note that collect() pulls the entire RDD into driver memory, so only use it on small results):
joinedRdd.collect().forEach(System.out::println);
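You can see the difference outside Spark as well. Below is a minimal sketch in plain Java (no Spark; the SerConsumer interface and ClosureDemo class are illustrative names, not anything from the Spark API) that tries to serialize both forms with an ObjectOutputStream, mirroring what Spark does to the foreach closure:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Consumer;

public class ClosureDemo {
    // A functional interface that is also Serializable, mirroring what
    // Spark requires of functions shipped to executors.
    interface SerConsumer<T> extends Consumer<T>, Serializable {}

    static void trySerialize(String label, SerConsumer<Object> f) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(f);
            System.out.println(label + ": serialized OK");
        } catch (IOException e) {
            System.out.println(label + ": " + e);
        }
    }

    public static void main(String[] args) {
        // The bound method reference captures its receiver, System.out
        // (a PrintStream, not Serializable) -> serialization fails.
        trySerialize("method reference", System.out::println);
        // The lambda captures nothing; it reads the static field
        // System.out only when invoked -> serialization succeeds.
        trySerialize("lambda", v -> System.out.println(v));
    }
}

The method reference drags System.out along as captured state, while the lambda body merely looks up the static field at call time, so only the lambda survives the round trip through serialization.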
Upvotes: 1