Reputation: 85
I'm trying to run the PutMerge program from Hadoop in Action by Chuck Lam, from Manning Publications. It should be pretty simple, but I've had a bunch of problems trying to run it, and I've gotten to this error that I just can't figure out. Meanwhile, I'm running a basic wordcount program with no problem. I've spent about 3 days on this now. I've done all the research I possibly can on this, and I'm just lost.
Y'all have any ideas?
Program:
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PutMerge {

    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem hdfs = FileSystem.get(conf);        // HDFS handle
        FileSystem local = FileSystem.getLocal(conf);  // local file system handle
        Path inputDir = new Path(args[0]);             // local input directory
        Path hdfsFile = new Path(args[1]);             // merged output file on HDFS

        try {
            FileStatus[] inputFiles = local.listStatus(inputDir);
            FSDataOutputStream out = hdfs.create(hdfsFile);
            // Append each local file, in turn, to the single HDFS output file.
            for (int i = 0; i < inputFiles.length; i++) {
                System.out.println(inputFiles[i].getPath().getName());
                FSDataInputStream in = local.open(inputFiles[i].getPath());
                byte[] buffer = new byte[256];
                int bytesRead = 0;
                while ((bytesRead = in.read(buffer)) > 0) {
                    out.write(buffer, 0, bytesRead);
                }
                in.close();
            }
            out.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Output Error from Eclipse:
2015-04-09 19:45:48,321 WARN util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FileSystem
at java.lang.ClassLoader.findBootstrapClass(Native Method)
at java.lang.ClassLoader.findBootstrapClassOrNull(ClassLoader.java:1012)
at java.lang.ClassLoader.loadClass(ClassLoader.java:413)
at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:344)
at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:370)
at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2563)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2574)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:169)
at PutMerge.main(PutMerge.java:16)
About Eclipse:
Eclipse IDE for Java Developers
Version: Luna Service Release 2 (4.4.2)
Build id: 20150219-0600
About Hadoop:
Hadoop 2.6.0
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1
Compiled by jenkins on 2014-11-13T21:10Z
Compiled with protoc 2.5.0
From source with checksum 18e43357c8f927c0695f1e9522859d6a
This command was run using /usr/local/hadoop-2.6.0/share/hadoop/common/hadoop-common-2.6.0.jar
About Java:
java version "1.8.0_31"
Java(TM) SE Runtime Environment (build 1.8.0_31-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.31-b07, mixed mode)
About my machine:
Mac OSX 10.9.5
Java Build Path - External JARs in Library: (screenshot of the build path omitted)
Upvotes: 3
Views: 8056
Reputation: 97
My experience with Eclipse IDE:
My base path for the Ubuntu installation is /usr/hadoop/hadoop-2.7.1 (let's call it CONF). I added the JAR files from CONF/share/hadoop/common/lib and from CONF/share/hadoop/common to the project's build path. And this is the Java code (from the book Hadoop in Action):
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PutMerge {

    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        // Explicitly bind the file:// scheme to LocalFileSystem instead of
        // relying on service-loader discovery.
        conf.set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem.class.getName());
        FileSystem hdfs = FileSystem.get(conf);
        FileSystem local = FileSystem.getLocal(conf);
        Path inputDir = new Path(args[0]);
        Path hdfsFile = new Path(args[1]);

        try {
            FileStatus[] inputFiles = local.listStatus(inputDir);
            FSDataOutputStream out = hdfs.create(hdfsFile);
            for (int i = 0; i < inputFiles.length; i++) {
                System.out.println(inputFiles[i].getPath().getName());
                FSDataInputStream in = local.open(inputFiles[i].getPath());
                byte[] buffer = new byte[256];
                int bytesRead = 0;
                while ((bytesRead = in.read(buffer)) > 0) {
                    out.write(buffer, 0, bytesRead);
                }
                in.close();
            }
            out.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
The solution for me was to export a .jar file from this code, and this is what I did: right-click on the PutMerge project, then Export (from the pop-up menu).
I saved the JAR file in a folder named PutMerge in the /home/hduser directory.
In another folder named input (path /home/hduser/input) there are three .txt files as input for the PutMerge procedure.
Now we are ready to launch the command from a terminal session:
hadoop jar /home/hduser/PutMerge/PutMerge.jar PutMerge /home/hduser/input output4/all
and the command
/usr/hadoop/hadoop-2.7.1$ hdfs dfs -cat /output4/all
will print the text of all three input files, merged into one.
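For a quick check from Java instead of the shell, here is a minimal read-back sketch, the programmatic equivalent of hdfs dfs -cat (CatMerged is a hypothetical helper, not part of the book's code; run it the same way as PutMerge, passing the merged file's path, e.g. output4/all):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class CatMerged {
    public static void main(String[] args) throws Exception {
        // Same FileSystem API as PutMerge; args[0] is the merged file's path.
        FileSystem hdfs = FileSystem.get(new Configuration());
        try (FSDataInputStream in = hdfs.open(new Path(args[0]))) {
            IOUtils.copyBytes(in, System.out, 4096, false); // stream file to stdout
        }
    }
}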
Upvotes: 1
Reputation: 2998
If you are using a run configuration to launch your app for debugging, make sure the checkbox Include dependencies with "Provided" scope is checked if any of your dependencies have their scope set to provided. (A provided-scope dependency is available at compile time but is expected to be supplied by the runtime environment, so the IDE leaves it off the run classpath unless that box is checked.) This approach worked for me.
Upvotes: 0
Reputation: 15879
I had this problem when my local Maven repository contained corrupted JAR files. Same as you, I could see that hadoop-common-x.x.x.jar existed in Eclipse when viewing the "Maven Dependencies" of my Java project. However, when expanding the JAR file in Eclipse and selecting the class named org.apache.hadoop.fs.FSDataInputStream, Eclipse reported a message along the lines of "Invalid LOC header".
Deleting all files from my local Maven repository (by default under ~/.m2/repository) and executing mvn install again resolved my issue.
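If you want to identify the corrupt JAR before wiping the whole repository, a small sketch like the following can help (JarCheck is a hypothetical helper, not part of Hadoop or Maven): it forces every entry of the archive to be decompressed, which makes corruption such as "invalid LOC header" surface as a ZipException.

import java.io.IOException;
import java.io.InputStream;
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class JarCheck {
    public static void main(String[] args) throws IOException {
        // args[0]: path to a suspect JAR, e.g. one under ~/.m2/repository
        try (JarFile jar = new JarFile(args[0])) {
            Enumeration<JarEntry> entries = jar.entries();
            byte[] buffer = new byte[8192];
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                // Draining the stream forces decompression, where a damaged
                // archive throws a ZipException.
                try (InputStream in = jar.getInputStream(entry)) {
                    while (in.read(buffer) != -1) {
                        // discard
                    }
                }
            }
        }
        System.out.println(args[0] + " looks intact");
    }
}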
Upvotes: 0
Reputation: 47
Put something like this in your code:
Configuration configuration = new Configuration();
configuration.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());
configuration.set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem.class.getName());
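For context, a minimal sketch of where those lines fit (FsImplDemo is a hypothetical class name; the two set calls bind the hdfs:// and file:// schemes to concrete implementations explicitly, instead of relying on the service-loader discovery that fails in the stack trace above; note that org.apache.hadoop.hdfs.DistributedFileSystem lives in the hadoop-hdfs JAR, which must also be on the classpath):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class FsImplDemo {
    public static void main(String[] args) throws Exception {
        Configuration configuration = new Configuration();
        // Map each URI scheme to its FileSystem class explicitly, so a
        // missing META-INF/services entry cannot break the lookup.
        configuration.set("fs.hdfs.impl",
                org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());
        configuration.set("fs.file.impl",
                org.apache.hadoop.fs.LocalFileSystem.class.getName());
        FileSystem fs = FileSystem.get(configuration);          // default scheme
        FileSystem local = FileSystem.getLocal(configuration);  // file://
        System.out.println("default: " + fs.getUri() + ", local: " + local.getUri());
    }
}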
Upvotes: 0