Reputation: 159
I have a Mapper class (CustomMapper.class) and a Reducer class (CustomReducer.class) that I want to use in Spark. In Hadoop I could use them by creating a Job object and setting the required Mapper and Reducer classes as follows:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

Configuration conf = new Configuration();
Job j = new Job(conf, "Adjacency Generator Job");
j.setMapperClass(CustomMapper.class);
j.setReducerClass(CustomReducer.class);
How can I achieve the same in Spark using Java? I have created a JavaRDD object as follows:
SparkConf conf = new SparkConf().setAppName("startingSpark").setMaster("local[*]");
JavaSparkContext sc = new JavaSparkContext(conf);
JavaRDD<String> myFile = sc.textFile(args[0]);
I am not sure how to bind the Mapper and Reducer classes in Spark using Java. Any help is appreciated.
Upvotes: 0
Views: 63
Reputation: 4048
Why do you want to do that? Spark internally builds a DAG of execution made up of transformations (like map, filter, etc.) and actions (like collect, count, etc.) that trigger the DAG. This is a fundamentally different model of computation from MapReduce. Roughly, your Mapper corresponds to a map (or flatMap) transformation on the RDD, and your Reducer to one of the aggregation functions such as reduceByKey. Please read the docs to understand how Spark works. A rough sketch of the translation is shown below.
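For example, assuming (hypothetically, since you haven't shown their code) that your CustomMapper emits a (token, 1) pair per token and your CustomReducer sums the counts per key, and that you are on the Spark 2.x Java API, the equivalent pipeline could look roughly like this:

import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class SparkMapReduceSketch {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("startingSpark").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);
        JavaRDD<String> myFile = sc.textFile(args[0]);

        // "Mapper" step: split each line and emit (token, 1) pairs,
        // much like a Mapper writing key/value pairs to the context.
        JavaPairRDD<String, Integer> mapped = myFile
                .flatMap(line -> Arrays.asList(line.split("\\s+")).iterator())
                .mapToPair(token -> new Tuple2<>(token, 1));

        // "Reducer" step: all values sharing a key are combined,
        // much like a Reducer iterating over the values for one key.
        JavaPairRDD<String, Integer> reduced = mapped.reduceByKey((a, b) -> a + b);

        reduced.saveAsTextFile(args[1]);
        sc.stop();
    }
}

The lambdas passed to flatMap/mapToPair play the role of your Mapper's map() method, and the function passed to reduceByKey plays the role of your Reducer's reduce() method; Spark handles the shuffle between them for you.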
Upvotes: 1