Reputation: 610
I am new to Hadoop. I have written a word count program that takes a single input file and produces a single output file. Now I want to take two files as input and write the output to a single file. I tried this:
FileInputFormat.setInputPaths(conf, new Path(args[0]), new Path(args[1]));
FileOutputFormat.setOutputPath(conf, new Path(args[2]));
This is the command in terminal:
hadoop jar test.jar Driver /user/in.txt /user/sample.txt /user/out
When I run this, it takes sample.txt as the output directory and fails with:
Output directory hdfs://localhost:9000/user/sample.txt already exists
Can anyone help me with this?
Upvotes: 2
Views: 1523
Reputation: 2574
If all your input files are in one folder, as you mentioned (/user), then replace
hadoop jar test.jar Driver /user/in.txt /user/sample.txt /user/out
with this
hadoop jar test.jar Driver /user /user/out
This takes all the files inside the /user directory as input and writes the output to the /user/out folder in HDFS.
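If you would rather keep listing the files one by one, one option is to have the driver treat the last argument as the output and everything before it as input. Below is a minimal plain-Java sketch of just that argument handling, not your actual driver. The class and method names (PathArgs, inputPaths, outputPath, commaSeparated) are made up for illustration, and the Hadoop calls appear only in comments, assuming the old mapred API with a JobConf named conf as in the question.

```java
import java.util.Arrays;

public class PathArgs {
    // Every argument except the last one is treated as an input path.
    public static String[] inputPaths(String[] args) {
        if (args.length < 2) {
            throw new IllegalArgumentException("usage: <input>... <output>");
        }
        return Arrays.copyOfRange(args, 0, args.length - 1);
    }

    // The last argument is treated as the output directory.
    public static String outputPath(String[] args) {
        if (args.length < 2) {
            throw new IllegalArgumentException("usage: <input>... <output>");
        }
        return args[args.length - 1];
    }

    // Joins the input paths with commas, the form accepted by
    // FileInputFormat.setInputPaths(conf, String commaSeparatedPaths).
    public static String commaSeparated(String[] args) {
        return String.join(",", inputPaths(args));
    }

    public static void main(String[] args) {
        System.out.println("inputs: " + commaSeparated(args));
        System.out.println("output: " + outputPath(args));
        // In the real driver (assumption: JobConf conf, as in the question):
        // FileInputFormat.setInputPaths(conf, commaSeparated(args));
        // FileOutputFormat.setOutputPath(conf, new Path(outputPath(args)));
    }
}
```

With this layout the same driver accepts one directory, one file, or several files, as long as the output path comes last.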
Upvotes: 1
Reputation: 1311
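Maybe it is because Driver is being taken as your first argument. If the jar's manifest already names a main class, hadoop jar passes everything after the jar name to your program, so args[0] is Driver, args[1] is /user/in.txt, and args[2] (your output path) is /user/sample.txt. Why don't you try it like this?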
hadoop jar test.jar /user/in.txt /user/sample.txt /user/out
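To see why an extra Driver token would matter, here is a small plain-Java sketch of the argument shift. It only models the behaviour as I understand it (the class name is consumed from the command line only when the jar's manifest has no Main-Class entry) and is not Hadoop's own code. The names ArgShift and programArgs are hypothetical.

```java
import java.util.Arrays;

public class ArgShift {
    // afterJar: the tokens typed after "hadoop jar test.jar".
    // Returns the array that the program's main(String[] args) would receive.
    public static String[] programArgs(String[] afterJar, boolean manifestHasMainClass) {
        if (manifestHasMainClass) {
            // Nothing is consumed: the first token becomes args[0].
            return afterJar.clone();
        }
        if (afterJar.length == 0) {
            throw new IllegalArgumentException("main class name expected");
        }
        // The first token is consumed as the class name.
        return Arrays.copyOfRange(afterJar, 1, afterJar.length);
    }

    public static void main(String[] unused) {
        String[] typed = {"Driver", "/user/in.txt", "/user/sample.txt", "/user/out"};
        // prints: with Main-Class in manifest: args[2] = /user/sample.txt
        System.out.println("with Main-Class in manifest: args[2] = "
                + programArgs(typed, true)[2]);
        // prints: without Main-Class in manifest: args[2] = /user/out
        System.out.println("without Main-Class in manifest: args[2] = "
                + programArgs(typed, false)[2]);
    }
}
```

In the first case args[2] is /user/sample.txt, which matches the "Output directory ... already exists" error in the question.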
Upvotes: 2