Reputation: 313
I have set up a Hadoop HDFS cluster and, since I am new to Hadoop, I've been trying to follow a simple example to read from and write to HDFS from a Java driver program I am writing on my local machine. The example I am trying to test is the following:
public static void main(String[] args) throws IOException {
    args = new String[3];
    args[0] = "add";
    args[1] = "./files/jaildata.csv";
    args[2] = "hdfs://<Namenode-Host>:<Port>/dir1/dir2/";
    if (args.length < 1) {
        System.out.println("Usage: hdfsclient add/read/delete/mkdir [<local_path> <hdfs_path>]");
        System.exit(1);
    }
    FileSystemOperations client = new FileSystemOperations();
    String hdfsPath = "hdfs://<Namenode-Host>:<Port>";
    Configuration conf = new Configuration();
    conf.addResource(new Path("file:///user/local/hadoop/etc/hadoop/core-site.xml"));
    conf.addResource(new Path("file:///user/local/hadoop/etc/hadoop/hdfs-site.xml"));
    if (args[0].equals("add")) {
        if (args.length < 3) {
            System.out.println("Usage: hdfsclient add <local_path> <hdfs_path>");
            System.exit(1);
        }
        client.addFile(args[1], args[2], conf);
    } else {
        System.out.println("Usage: hdfsclient add/read/delete/mkdir [<local_path> <hdfs_path>]");
        System.exit(1);
    }
    System.out.println("Done!");
}
Where the addFile function is the following:
public void addFile(String source, String dest, Configuration conf) throws IOException {
    FileSystem fileSystem = FileSystem.get(conf);
    // Get the filename out of the file path
    String filename = source.substring(source.lastIndexOf('/') + 1, source.length());
    // Create the destination path including the filename.
    if (dest.charAt(dest.length() - 1) != '/') {
        dest = dest + "/" + filename;
    } else {
        dest = dest + filename;
    }
    Path path = new Path(dest);
    if (fileSystem.exists(path)) {
        System.out.println("File " + dest + " already exists");
        return;
    }
    // Create a new file and write data to it.
    FSDataOutputStream out = fileSystem.create(path);
    InputStream in = new BufferedInputStream(new FileInputStream(new File(source)));
    byte[] b = new byte[1024];
    int numBytes = 0;
    while ((numBytes = in.read(b)) > 0) {
        out.write(b, 0, numBytes);
    }
    // Close all the file descriptors
    in.close();
    out.close();
    fileSystem.close();
}
The project is a Maven project, with hadoop-common-2.6.5, hadoop-hdfs-2.9.0 and hadoop-hdfs-client-2.9.0 added as dependencies, and it is configured to build a jar with all the dependencies included.
My problem is that, whichever demo example I try, I get the following exception at the point where the FileSystem gets created, i.e. at the call FileSystem.get(conf):
Exception in thread "main" java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.hdfs.DistributedFileSystem could not be instantiated
    at java.util.ServiceLoader.fail(ServiceLoader.java:232)
    at java.util.ServiceLoader.access$100(ServiceLoader.java:185)
    at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384)
    at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
    at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
    at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2565)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2576)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2593)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2632)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2614)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:169)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:354)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataOutputStreamBuilder
I have no clue how to proceed, and I've tried each of the few solutions I've seen on the net, so I would be grateful for any advice.
Thanks.
Upvotes: 1
Views: 7205
Reputation: 10727
The org.apache.hadoop.fs.FSDataOutputStreamBuilder class is not in hadoop-common-2.6.5 but in hadoop-common-2.9.0.
As I noticed, you are already using version 2.9.0 for hadoop-hdfs and hadoop-hdfs-client.
To fix this issue, refer to version 2.9.0 of hadoop-common in your build. Align the other Hadoop packages with 2.9.0 as well, in order to avoid similar issues.
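If it helps to verify the classpath before and after changing the version, a small check along these lines can report which jar, if any, supplies a given class at runtime. This is only a sketch: the class name ClasspathCheck and the helper locate are made up for illustration, and it assumes you run it with the same classpath (or from inside the same fat jar) as your driver.

import java.security.CodeSource;

// Hypothetical helper, not part of Hadoop: reports where a class is loaded from.
public class ClasspathCheck {

    /** Returns the jar or directory a class comes from, or "NOT FOUND" if it cannot be loaded. */
    public static String locate(String className) {
        try {
            // initialize=false: only look the class up, do not run its static initializers
            Class<?> c = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes loaded by the bootstrap loader (JDK classes) have no code source
            return src == null ? "(bootstrap/JDK)" : src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError raised while loading a superclass
            return "NOT FOUND";
        }
    }

    public static void main(String[] args) {
        // Class names taken from the stack trace in the question
        String[] names = {
            "org.apache.hadoop.fs.FileSystem",
            "org.apache.hadoop.fs.FSDataOutputStreamBuilder",
            "org.apache.hadoop.hdfs.DistributedFileSystem"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}

With mixed versions, I would expect FileSystem to resolve to the hadoop-common jar while FSDataOutputStreamBuilder shows as NOT FOUND; once every Hadoop artifact is on 2.9.0, all three should resolve to a jar location.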
Upvotes: 2