Reputation: 7326
I am implementing a customized version of WordCount.java in Hadoop where I am interested in outputting the word counts per node.
For example, given the text:
FindMe FindMe ..... .... .... .. more big text ... FindMe FindMe FindMe
I would like output such as:
FindMe node01: 2
FindMe node02: 3
Here is a snippet from my Mapper:
    String searchString = "FindMe";
    while (itr.hasMoreTokens()) {
        String token = itr.nextToken();
        if (token.equals(searchString)) {
            word.set(token);
            context.write(word, one);
        }
    }
This code outputs:
FindMe n
where n is the total number of occurrences in all the input.
How can I output the count for each node along with some kind of identifier for this node like the example I provided above?
Upvotes: 1
Views: 130
Reputation: 1811
You can emit the word plus the hostname as the key in the mapper. The reducer then sums the counts per distinct key, which gives you a word count for each node.
    // Hostname of the machine running this map task
    java.net.InetAddress localMachine = java.net.InetAddress.getLocalHost();
    String computerName = localMachine.getHostName();
    String searchString = "FindMe";
    while (itr.hasMoreTokens()) {
        String token = itr.nextToken();
        if (token.equals(searchString)) {
            // Composite key: "<word> <hostname>", e.g. "FindMe node01"
            word.set(token + " " + computerName);
            context.write(word, one);
        }
    }
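To see why the composite key gives per-node totals, here is a minimal Hadoop-free sketch that mimics the map and reduce steps with plain Java collections. The class name PerNodeCountSketch, the helper methods, and the "unknown-host" fallback are hypothetical and only for illustration; the hostnames node01 and node02 are taken from the question's example, not from a real cluster.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class PerNodeCountSketch {

    // Resolve the local hostname; fall back to a placeholder (an assumption
    // of this sketch) if the lookup fails.
    public static String localHostName() {
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            return "unknown-host";
        }
    }

    // Mimics one map task: records a "<word> <hostname>" key for each match,
    // standing in for context.write(word, one).
    public static void map(String text, String searchString,
                           String hostName, List<String> emitted) {
        StringTokenizer itr = new StringTokenizer(text);
        while (itr.hasMoreTokens()) {
            String token = itr.nextToken();
            if (token.equals(searchString)) {
                emitted.add(token + " " + hostName);
            }
        }
    }

    // Mimics shuffle and reduce: adds 1 per emitted record, grouped by key.
    public static Map<String, Integer> reduce(List<String> emitted) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String key : emitted) {
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> emitted = new ArrayList<>();
        // Two simulated map tasks on two different hosts.
        map("FindMe FindMe some other text", "FindMe", "node01", emitted);
        map("more text FindMe FindMe FindMe", "FindMe", "node02", emitted);
        System.out.println(reduce(emitted));
        // prints {FindMe node01=2, FindMe node02=3}
        System.out.println("This machine: " + localHostName());
    }
}
```

In a real job, several map tasks can run on the same host, and their counts are summed under the same key. Also note that the hostname identifies where the map task ran, which is not necessarily the node that stores the input block.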
Upvotes: 2