anonymous123

Reputation: 1285

Need assistance with running the WordCount.java provided by Cloudera

Hey guys, I am trying to run the WordCount.java example provided by Cloudera. I ran the command below and am getting the exception shown underneath it. Do you have any suggestions on how to proceed? I have gone through all the steps provided by Cloudera.

Thanks in advance.

hadoop jar ~/Desktop/wordcount.jar org.myorg.WordCount ~/Desktop/input
~/Desktop/output

Error:

ERROR security.UserGroupInformation: PriviledgedActionException
as:root (auth:SIMPLE)
cause:org.apache.hadoop.mapred.InvalidInputException: Input path does
not exist: hdfs://localhost/home/rushabh/Desktop/input
Exception in thread "main"
org.apache.hadoop.mapred.InvalidInputException: Input path does not
exist: hdfs://localhost/home/rushabh/Desktop/input
        at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:194)
        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:205)
        at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:977)
        at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:969)
        at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:880)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:416)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:807)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1248)
        at org.myorg.WordCount.main(WordCount.java:55)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:197)

Upvotes: 2

Views: 18444

Answers (6)

anonymous123

Reputation: 1285

I solved it by adding the input folder to HDFS using the following command:

hadoop dfs -put /usr/lib/hadoop/conf input/
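If you go the same route, you can confirm the upload worked and then point the job at the HDFS path. A rough sketch (the `input/` path here is relative to the running user's HDFS home directory, so adjust it to your setup):

```shell
# List the uploaded files to confirm they landed in HDFS
hadoop dfs -ls input/

# Re-run the job, now passing HDFS paths instead of local ones
hadoop jar ~/Desktop/wordcount.jar org.myorg.WordCount input output
```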

Upvotes: 1

somnathchakrabarti

Reputation: 3086

The error clearly states that your input path is local. Please specify an input path on HDFS rather than on the local machine. My guess is that

hadoop jar ~/Desktop/wordcount.jar org.myorg.WordCount ~/Desktop/input
~/Desktop/output

needs to be changed to

hadoop jar ~/Desktop/wordcount.jar org.myorg.WordCount <hdfs-input-dir>
<hdfs-output-dir>

NOTE: To run a MapReduce job, the input directory must be on HDFS, not on the local filesystem.

Hope this helps.
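As a rough sketch of the full sequence (the directory names `wordcount/input` and `wordcount/output` are just placeholders, substitute your own):

```shell
# Create an input directory in HDFS and copy the local input files into it
hadoop dfs -mkdir wordcount/input
hadoop dfs -put ~/Desktop/input/* wordcount/input/

# Run the job with HDFS paths; the output directory must not already exist
hadoop jar ~/Desktop/wordcount.jar org.myorg.WordCount wordcount/input wordcount/output
```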

Upvotes: 1

Hisham Muneer

Reputation: 8742

Your input and output files should be on HDFS; at least the input must be.

use the following command:

hadoop jar ~/Desktop/wordcount.jar org.myorg.WordCount hdfs:/input hdfs:/output

To copy a file from your local Linux filesystem to HDFS, use the following command:

hadoop dfs -copyFromLocal ~/Desktop/input hdfs:/

and check that your file is there using:

hadoop dfs -ls hdfs:/
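Once the job finishes, you can read the result straight out of HDFS. The `part-00000` file name below is the usual default for a single-reducer job, but check `hadoop dfs -ls hdfs:/output` for the actual names:

```shell
# Print the word counts produced by the job
hadoop dfs -cat hdfs:/output/part-00000

# Or merge all output parts into a single local file
hadoop dfs -getmerge hdfs:/output ~/Desktop/wordcount-results.txt
```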

Hope this will help.

Upvotes: 2

Rahul Mahajan

Reputation: 109

When I tried to run the wordcount MapReduce code, I got this error:

ERROR security.UserGroupInformation: PriviledgedActionException as:hduser cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/user/hduser/wordcount

I was trying to execute the wordcount MapReduce Java code with input and output paths of /user/hduser/wordcount and /user/hduser/wordcount-output. I prefixed these paths with the 'fs.default.name' value from core-site.xml and it ran perfectly.
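In other words, assuming core-site.xml sets `fs.default.name` to something like `hdfs://localhost:54310` (the host and port here are just an example, check your own core-site.xml), the fully-qualified invocation would look like:

```shell
hadoop jar wordcount.jar org.myorg.WordCount \
    hdfs://localhost:54310/user/hduser/wordcount \
    hdfs://localhost:54310/user/hduser/wordcount-output
```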

Upvotes: 1

user1617057

Reputation: 1

Check the ownership of the files in HDFS to ensure that the user running the job (root) has read privileges on the input files. Cloudera provides an HDFS viewer that you can use to view the filespace: open a web browser to either localhost:50075 or {fqdn}:50075 and click on "Browse the filesystem" to view the input directory and input files. The ownership flags work just like on a *nix filesystem.
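You can also check and fix ownership from the command line; `/user/root/input` below is only a placeholder path:

```shell
# Inspect permissions and ownership (columns mirror ls -l on *nix)
hadoop dfs -ls /user/root/input

# If needed, hand the files to the job's user, or open up read access
hadoop dfs -chown -R root /user/root/input
hadoop dfs -chmod -R 755 /user/root/input
```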

Upvotes: 0

Stephen C

Reputation: 718896

The error message says that this file does not exist: "hdfs://localhost/home/rushabh/Desktop/input".

Check that the file does exist at the location you've told it to use.

Check the hostname is correct. You are using "localhost" which most likely resolves to a loopback IP address; e.g. 127.0.0.1. That always means "this host" ... in the context of the machine that you are running the code on.

Upvotes: 1
