Reputation: 57
Hello everyone, I am very new to Hadoop. This is my first program and I need help solving the errors below.
When I put my file into HDFS directly, without using hdfs://localhost:9000/, I get an error saying the directory does not exist.
So I put the file into HDFS the following way:
hadoop fs -put file.txt hdfs://localhost:9000/sawai.txt
After this, the file is loaded into HDFS.
Then I tried to run my WordCount program from the jar file like this:
hadoop jar wordcount.jar hdp.WordCount sawai.txt outputdir
and I get the following error message:
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://localhost:9000/user/hadoop_usr/sawai.txt
Then I tried another way, specifying the full HDFS paths like this:
hadoop jar wordcount.jar hdp.WordCount hdfs://localhost:9000/sawai.txt hdfs://localhost:9000/outputdir
and I get the following error message:
org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://localhost:9000/sawai.txt already exists
at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:131)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:268)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:139)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:870)
at hdp.WordCount.run(WordCount.java:40)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at hdp.WordCount.main(WordCount.java:17)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
I read many articles that suggest changing the output directory name every time. I tried that, but it does not work in my case, and the problem seems to be in how the source file that the job should operate on is specified.
What is causing the exception and how can I solve it?
Upvotes: 0
Views: 438
Reputation: 513
Have you tried hadoop jar wordcount.jar hdp.WordCount /sawai.txt /outputdir? HDFS prefers FULL paths.
Also, I have never had to prepend "hdfs://localhost:9000/" to upload a file to HDFS or to run a jar. Usually you can just reference the full file path and it's fine. Maybe try it without that prefix?
If that does not fix it: it is best practice to increase the replication factor to three. Also, your file is significantly smaller than the block size, and that can become problematic. See Cloudera's advice on file and block size: http://blog.cloudera.com/blog/2009/02/the-small-files-problem
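For reference, here is a minimal driver sketch using the old mapred API that your stack trace points to (the class and job names are just placeholders, not your actual hdp.WordCount). With a driver like this, plain paths such as /sawai.txt and /outputdir are resolved against fs.defaultFS from core-site.xml, so no hdfs://localhost:9000 prefix is needed, while a relative path like sawai.txt resolves under /user/<your-user>/ instead:
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
public class PathDemoDriver {
    public static void main(String[] args) throws Exception {
        /* JobConf reads fs.defaultFS (hdfs://localhost:9000 in your setup) from
           core-site.xml on the classpath, so plain absolute paths are enough */
        JobConf conf = new JobConf(PathDemoDriver.class);
        conf.setJobName("path-demo");
        /* e.g. args[0] = /sawai.txt, args[1] = /outputdir;
           a relative path like sawai.txt would resolve under /user/<your-user>/ instead */
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        /* Mapper/Reducer setup omitted; this only illustrates how the paths are resolved */
        JobClient.runJob(conf);
    }
}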
Upvotes: 1
Reputation: 29165
I haven't seen your complete program with its input/output handling...
I think sawai.txt is the input file whose words you want to count, so why is it ending up as the output path?
However, see this example and add it to your driver. If the output path already exists, it is deleted first, so you won't get a FileAlreadyExistsException:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
/* Provides access to configuration parameters */
Configuration conf = new Configuration();
/* Create a FileSystem object from the configuration */
FileSystem fs = FileSystem.get(conf);
/* Check whether the output path (args[1]) exists */
if (fs.exists(new Path(args[1]))) {
    /* If it exists, delete the output path recursively */
    fs.delete(new Path(args[1]), true);
}
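For context, here is a rough skeleton of where that check could sit in a Tool-based driver like the hdp.WordCount in your stack trace (this class is a hypothetical sketch, not your actual code, and the mapper/reducer configuration is omitted):
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
public class WordCountSkeleton extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        JobConf conf = new JobConf(getConf(), WordCountSkeleton.class);
        Path input = new Path(args[0]);
        Path output = new Path(args[1]);
        /* Remove a stale output directory before submitting the job */
        FileSystem fs = FileSystem.get(conf);
        if (fs.exists(output)) {
            fs.delete(output, true);
        }
        FileInputFormat.setInputPaths(conf, input);
        FileOutputFormat.setOutputPath(conf, output);
        /* Mapper/Reducer configuration omitted for brevity */
        JobClient.runJob(conf);
        return 0;
    }
    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new WordCountSkeleton(), args));
    }
}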
Upvotes: 1