Reputation: 2368
I am learning Apache Spark. Given the Java implementation of a Spark word count below, I am confused about some of its details.
public class JavaWordCount {
    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("Usage: JavaWordCount <master> <file>");
            System.exit(1);
        }
        JavaSparkContext ctx = new JavaSparkContext(args[0], "JavaWordCount",
                System.getenv("SPARK_HOME"), System.getenv("SPARK_EXAMPLES_JAR"));
        JavaRDD<String> lines = ctx.textFile(args[1], 1);
        JavaRDD<String> words = lines.flatMap(new FlatMapFunction<String, String>() {
            public Iterable<String> call(String s) {
                return Arrays.asList(s.split(" "));
            }
        });
        JavaPairRDD<String, Integer> ones = words.map(new PairFunction<String, String, Integer>() {
            public Tuple2<String, Integer> call(String s) {
                return new Tuple2<String, Integer>(s, 1);
            }
        });
        JavaPairRDD<String, Integer> counts = ones.reduceByKey(new Function2<Integer, Integer, Integer>() {
            public Integer call(Integer i1, Integer i2) {
                return i1 + i2;
            }
        });
        List<Tuple2<String, Integer>> output = counts.collect();
        for (Tuple2 tuple : output) {
            System.out.println(tuple._1 + ": " + tuple._2);
        }
        System.exit(0);
    }
}
As I understand it, beginning at line 12, the code passes an anonymous FlatMapFunction class to lines.flatMap() as an argument. What, then, does the String s mean? No String s is ever created and passed in as an argument, so how does the FlatMapFunction<String, String>() {} class work when no specific arguments are passed to it?
Upvotes: 0
Views: 429
Reputation: 509
All you are doing here is passing a FlatMapFunction as an argument to the flatMap method; the FlatMapFunction you pass overrides call(String s):
JavaRDD<String> words = lines.flatMap(new FlatMapFunction<String, String>()
{
    public Iterable<String> call(String s)
    {
        return Arrays.asList(s.split(" "));
    }
});
The code implementing lines.flatMap could look like this for instance:
public JavaRDD<String> flatMap(FlatMapFunction<String, String> map)
{
    String str = "some string";
    Iterable<String> it = map.call(str);
    // do stuff with 'it'
    // return a JavaRDD<String>
}
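To make that concrete without pulling in Spark, here is a minimal, self-contained sketch. Everything in it is hypothetical and only for illustration: SimpleFlatMapFunction stands in for Spark's FlatMapFunction, and MiniLines stands in for the RDD of lines (it is just an in-memory list, not how Spark actually stores or distributes data). The point is only that the flatMap implementation is the code that supplies each String s.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FlatMapSketch {
    // Hypothetical stand-in for Spark's FlatMapFunction; not the real interface.
    interface SimpleFlatMapFunction<T, R> {
        Iterable<R> call(T t);
    }

    // Hypothetical stand-in for an RDD of lines: just a list held in memory.
    static class MiniLines {
        private final List<String> data;

        MiniLines(List<String> data) {
            this.data = data;
        }

        // The caller never invokes call() itself; this loop does,
        // passing each stored line in as the parameter s.
        List<String> flatMap(SimpleFlatMapFunction<String, String> f) {
            List<String> out = new ArrayList<>();
            for (String line : data) {
                for (String piece : f.call(line)) {
                    out.add(piece);
                }
            }
            return out;
        }
    }

    // Same shape as the word-splitting step in the question.
    static List<String> splitWords(List<String> input) {
        return new MiniLines(input).flatMap(new SimpleFlatMapFunction<String, String>() {
            public Iterable<String> call(String s) {
                return Arrays.asList(s.split(" "));
            }
        });
    }

    public static void main(String[] args) {
        // prints [a, b, c, d, e]
        System.out.println(splitWords(Arrays.asList("a b", "c d e")));
    }
}
```

Here call(String s) runs once per line, first with s = "a b" and then with s = "c d e". Spark's real flatMap is far more involved, but the relationship between your anonymous class and the code that calls it is the same.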
Upvotes: 3
Reputation: 20520
The anonymous class instance you're passing is overriding the call(String s) method. Whatever is receiving this anonymous class instance is something that wants to make use of that call() method during its execution: it will be (somehow) constructing strings and passing them (directly or indirectly) to the call() method of whatever you've passed in.
So the fact that you're not invoking the method you've defined isn't a worry: something else is doing so.
This is a common use case for anonymous inner classes. A method m() expects to be passed something that implements the Blah interface, and the Blah interface has a frobnicate(String s) method in it. So we call it with
m(new Blah() {
    public void frobnicate(String s) {
        // exciting code goes here to do something with s
    }
});
and the m method will now be able to take this instance that implements Blah, and invoke frobnicate() on it.
Perhaps m looks like this:
public void m(Blah b) {
    b.frobnicate("whatever");
}
Now the frobnicate() method that we wrote in our inner class is being invoked, and as it runs, the parameter s will be set to "whatever".
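If you want to watch this happen, the pieces above can be assembled into one runnable sketch. The class name CallbackDemo, the run() helper, and the second frobnicate() call are assumptions added purely for illustration; they show that s takes whatever value the caller of frobnicate() chooses, each time it is called.

```java
import java.util.ArrayList;
import java.util.List;

public class CallbackDemo {
    interface Blah {
        void frobnicate(String s);
    }

    // m decides what gets passed as s; the anonymous class only receives it.
    static void m(Blah b) {
        b.frobnicate("whatever");
        b.frobnicate("something else"); // extra call, added for illustration
    }

    // Records every value of s that frobnicate() was invoked with.
    static List<String> run() {
        final List<String> seen = new ArrayList<>();
        m(new Blah() {
            public void frobnicate(String s) {
                seen.add(s);
            }
        });
        return seen;
    }

    public static void main(String[] args) {
        // prints [whatever, something else]
        System.out.println(run());
    }
}
```

The anonymous class never creates a String itself. It just describes what to do with one, and m supplies the actual values.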
Upvotes: 3