MChirukuri

Reputation: 610

Word count program with two input files and single output file

I am new to Hadoop. I have written a word count program that takes a single input file and writes a single output file. Now I want to take two files as input and write the output to a single file. I tried this:

FileInputFormat.setInputPaths(conf, new Path(args[0]), new Path(args[1]));
FileOutputFormat.setOutputPath(conf, new Path(args[2]));

This is the command in terminal:

hadoop jar test.jar Driver /user/in.txt /user/sample.txt /user/out

When I run this, it takes sample.txt as the output directory and reports:

Output directory hdfs://localhost:9000/user/sample.txt already exists

Can anyone help me with this?

Upvotes: 2

Views: 1523

Answers (2)

Rajesh N

Reputation: 2574

If all the input files are in one folder, as you mentioned (/user), then replace

hadoop jar test.jar Driver /user/in.txt /user/sample.txt /user/out

with this

hadoop jar test.jar Driver /user /user/out

This takes all the files inside the /user directory as input and writes the output to the /user/out folder in HDFS.
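Note that the driver in the question reads the output path from args[2], so a two-argument command only works if the driver takes the output from the last argument instead. A minimal sketch of that argument handling, in plain Java with no Hadoop dependency, is below. The class name DriverArgs and its methods are hypothetical, and handing the joined string to FileInputFormat.setInputPaths is an assumption about how you would wire it into your own driver.

```java
import java.util.Arrays;

// Hypothetical helper: treats the last argument as the output path and
// every argument before it as an input path (file or directory).
class DriverArgs {
    final String[] inputs;
    final String output;

    DriverArgs(String[] args) {
        if (args.length < 2) {
            throw new IllegalArgumentException("usage: <input>... <output>");
        }
        inputs = Arrays.copyOfRange(args, 0, args.length - 1);
        output = args[args.length - 1];
    }

    // Comma-separated form, assumed to be what the driver would pass on
    // to FileInputFormat.setInputPaths(conf, String).
    String commaSeparatedInputs() {
        return String.join(",", inputs);
    }

    public static void main(String[] unused) {
        // Two input files, as in the question.
        DriverArgs twoFiles = new DriverArgs(
            new String[] {"/user/in.txt", "/user/sample.txt", "/user/out"});
        System.out.println(twoFiles.commaSeparatedInputs() + " -> " + twoFiles.output);

        // One input directory, as in this answer.
        DriverArgs oneDir = new DriverArgs(new String[] {"/user", "/user/out"});
        System.out.println(oneDir.commaSeparatedInputs() + " -> " + oneDir.output);
    }
}
```

With this layout the same driver accepts either the two-file command or the single-directory command without code changes.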

Upvotes: 1

salmanbw

Reputation: 1311

Maybe it is taking Driver as your first argument. That would shift /user/sample.txt into args[2], which matches the error message. Why don't you try it like this:

hadoop jar test.jar /user/in.txt /user/sample.txt /user/out
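To see why the extra Driver token matters, here is a small simulation in plain Java. It assumes that hadoop jar consumes the class name itself only when the jar manifest has no Main-Class entry, and otherwise passes every token after the jar name to main(). The class ArgShiftDemo and its method are made up for illustration and are not part of Hadoop.

```java
import java.util.Arrays;

class ArgShiftDemo {
    // Returns the arguments that main() would receive, under the assumption
    // that the class name is consumed by the launcher only when the jar
    // manifest does not already name a Main-Class.
    static String[] argsSeenByMain(String[] afterJar, boolean manifestHasMainClass) {
        if (manifestHasMainClass) {
            return afterJar.clone();  // "Driver" stays in place as args[0]
        }
        return Arrays.copyOfRange(afterJar, 1, afterJar.length);  // "Driver" is dropped
    }

    public static void main(String[] unused) {
        String[] cmd = {"Driver", "/user/in.txt", "/user/sample.txt", "/user/out"};

        String[] shifted = argsSeenByMain(cmd, true);
        System.out.println("Main-Class in manifest:    args[2] = " + shifted[2]);

        String[] normal = argsSeenByMain(cmd, false);
        System.out.println("no Main-Class in manifest: args[2] = " + normal[2]);
    }
}
```

In the first case args[2] is /user/sample.txt, which is the path named in the "already exists" error. In the second it is /user/out, as intended.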

Upvotes: 2
