Reputation: 243
I have created a jar that runs a MapReduce job and writes its output to a directory. I need to read the data in that output directory from my Java code, which does not run in the Hadoop environment, without copying it to a local directory. I am using ProcessBuilder to run the jar. Can anyone help?
Upvotes: 1
Views: 318
Reputation: 34184
What is the problem with reading the HDFS data through the HDFS API?
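import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
public static void main(String[] args) throws IOException {
    // Load the cluster settings so that FileSystem.get() resolves to HDFS, not the local file system.
    Configuration conf = new Configuration();
    conf.addResource(new Path("/hadoop/projects/hadoop-1.0.4/conf/core-site.xml"));
    conf.addResource(new Path("/hadoop/projects/hadoop-1.0.4/conf/hdfs-site.xml"));
    FileSystem fs = FileSystem.get(conf);
    // FSDataInputStream.readLine() is deprecated, so wrap the stream in a BufferedReader.
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(fs.open(new Path("/mapout/input.txt"))))) {
        System.out.println(reader.readLine());
    }
}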
Your program can run outside your Hadoop cluster, but the Hadoop daemons must be running and reachable from it.
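To see what the addResource calls contribute, here is a minimal sketch using only the JDK that pulls a single property out of a core-site.xml style document. It assumes Hadoop 1.x, where the default file system is named by the fs.default.name property. The CoreSiteLookup class, its lookup method, and the host and port in the sample are hypothetical and only for illustration; in real code Hadoop's Configuration does this parsing for you.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class CoreSiteLookup {
    // Returns the <value> of the <property> whose <name> equals key, or null if absent.
    // Assumes every <property> has both a <name> and a <value> child.
    public static String lookup(String xml, String key) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                String name = p.getElementsByTagName("name").item(0).getTextContent().trim();
                if (name.equals(key)) {
                    return p.getElementsByTagName("value").item(0).getTextContent().trim();
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical file content; the host and port are placeholders.
        String xml = "<configuration><property><name>fs.default.name</name>"
                + "<value>hdfs://namenode.example:9000</value></property></configuration>";
        System.out.println(lookup(xml, "fs.default.name"));
    }
}
```

If that property is missing from the loaded resources, FileSystem.get(conf) falls back to the local file system, which is a common reason for a "file not found" error when the path really exists in HDFS.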
Upvotes: 1
Reputation: 603
You can read the output of the job from within your MR driver code once the job has finished:
// Block until the job completes, then list the files it wrote.
job.waitForCompletion(true);
FileSystem fs = FileSystem.get(conf);
// output is the Path of the job's output directory; the filter is meant to skip non-data entries.
Path[] outputFiles = FileUtil.stat2Paths(fs.listStatus(output, new OutputFilesFilter()));
for (Path file : outputFiles) {
    InputStream is = fs.open(file);
    BufferedReader reader = new BufferedReader(new InputStreamReader(is));
    ---
    ---
}
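The filter passed to listStatus matters because a job's output directory usually holds more than the part files. Below is a rough, JDK-only sketch of the kind of rule such a filter applies. It assumes the filter's job is to skip the _SUCCESS marker and the _logs directory; the OutputNameFilter class and its methods are hypothetical and are not Hadoop's own implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OutputNameFilter {
    // Assumed rule: everything except the _SUCCESS marker and the _logs directory is data.
    public static boolean isDataFile(String name) {
        return !name.equals("_SUCCESS") && !name.equals("_logs");
    }

    // Keeps only the names that pass isDataFile, preserving their order.
    public static List<String> dataFiles(List<String> names) {
        List<String> kept = new ArrayList<>();
        for (String name : names) {
            if (isDataFile(name)) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Sample listing of an output directory; the file names are illustrative.
        List<String> listing = Arrays.asList("_SUCCESS", "_logs", "part-r-00000", "part-r-00001");
        System.out.println(dataFiles(listing)); // prints [part-r-00000, part-r-00001]
    }
}
```

In the driver you would express the same rule as a PathFilter and test path.getName() instead of a plain string.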
Upvotes: 1