Omen

Reputation: 313

Hadoop hdfs DistributedFileSystem could not be instantiated

I have set up a Hadoop HDFS cluster and, since I am new to Hadoop, I have been trying to follow a simple example that reads from and writes to HDFS from a Java driver program running on my local machine. The example I am testing is the following:

public static void main(String[] args) throws IOException {

    args = new String[3];
    args[0] = "add";
    args[1] = "./files/jaildata.csv";
    args[2] = "hdfs://<Namenode-Host>:<Port>/dir1/dir2/";        
    if (args.length < 1) {
        System.out.println("Usage: hdfsclient add/read/delete/mkdir [<local_path> <hdfs_path>]");
        System.exit(1);
    }

    FileSystemOperations client = new FileSystemOperations();
    String hdfsPath = "hdfs://<Namenode-Host>:<Port>";

    Configuration conf = new Configuration();
    conf.addResource(new Path("file:///user/local/hadoop/etc/hadoop/core-site.xml"));
    conf.addResource(new Path("file:///user/local/hadoop/etc/hadoop/hdfs-site.xml"));

    if (args[0].equals("add")) {
        if (args.length < 3) {
            System.out.println("Usage: hdfsclient add <local_path> <hdfs_path>");
            System.exit(1);
        }
        client.addFile(args[1], args[2], conf);

    } else {
        System.out.println("Usage: hdfsclient add/read/delete/mkdir [<local_path> <hdfs_path>]");
        System.exit(1);
    }
    System.out.println("Done!");
}

The addFile method is as follows:

public void addFile(String source, String dest, Configuration conf) throws IOException {

    FileSystem fileSystem = FileSystem.get(conf);

    // Get the filename out of the file path
    String filename = source.substring(source.lastIndexOf('/') + 1, source.length());

    // Create the destination path including the filename.
    if (dest.charAt(dest.length() - 1) != '/') {
        dest = dest + "/" + filename;
    } else {
        dest = dest + filename;
    }
    Path path = new Path(dest);
    if (fileSystem.exists(path)) {
        System.out.println("File " + dest + " already exists");
        return;
    }

    // Create a new file and write data to it.
    FSDataOutputStream out = fileSystem.create(path);
    InputStream in = new BufferedInputStream(new FileInputStream(new File(source)));

    byte[] b = new byte[1024];
    int numBytes = 0;
    while ((numBytes = in.read(b)) > 0) {
        out.write(b, 0, numBytes);
    }

    // Close all the file descriptors
    in.close();
    out.close();
    fileSystem.close();
}

The project is a Maven project with hadoop-common-2.6.5, hadoop-hdfs-2.9.0 and hadoop-hdfs-client-2.9.0 added as dependencies, and it is configured to build a jar with all dependencies included.

My problem is that, no matter which demo example I try, I get the following exception at the point where the FileSystem is created, at FileSystem fileSystem = FileSystem.get(conf);:

Exception in thread "main" java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.hdfs.DistributedFileSystem could not be instantiated
at java.util.ServiceLoader.fail(ServiceLoader.java:232)
at java.util.ServiceLoader.access$100(ServiceLoader.java:185)
at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384)
at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2565)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2576)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2593)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2632)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2614)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:169)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:354)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataOutputStreamBuilder

I have no idea how to proceed, and I have tried each of the few solutions I have found online, so I would be grateful for any advice.

Thanks.

Upvotes: 1

Views: 7205

Answers (1)

user987339

Reputation: 10727

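The class org.apache.hadoop.fs.FSDataOutputStreamBuilder is not in hadoop-common-2.6.5; it is in hadoop-common-2.9.0. The 2.9.0 DistributedFileSystem appears to depend on it, which would explain why instantiating it against the older hadoop-common fails with NoClassDefFoundError.

As I noticed, you are already using version 2.9.0 for hadoop-hdfs and hadoop-hdfs-client. Align the other Hadoop packages with 2.9.0 to avoid similar issues.

To fix this particular error, reference version 2.9.0 of hadoop-common in your build.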
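To confirm a version mismatch like this one, it can help to check which jar (if any) supplies the classes named in the stack trace. The following is only a minimal sketch using plain JDK reflection; the class name ClasspathCheck and the helper locate are hypothetical names made up for illustration, not part of Hadoop. Run it with the same classpath as the failing program.

```java
import java.security.CodeSource;

public class ClasspathCheck {

    // Hypothetical helper: reports where a class is loaded from,
    // or a "NOT FOUND" message if it cannot be loaded at all.
    static String locate(String className) {
        try {
            // initialize=false loads the class without running its static initializers
            Class<?> cls = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                // Classes shipped with the JDK usually have no code source
                return "(no code source, likely a JDK class)";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "NOT FOUND: " + e;
        }
    }

    public static void main(String[] args) {
        // Class names taken from the stack trace in the question
        String[] names = {
            "org.apache.hadoop.fs.FileSystem",
            "org.apache.hadoop.fs.FSDataOutputStreamBuilder",
            "org.apache.hadoop.hdfs.DistributedFileSystem"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

If FileSystem resolves to a hadoop-common-2.6.5 jar while FSDataOutputStreamBuilder is reported as not found, that is consistent with the mismatch described above. With a jar that bundles all dependencies, every class resolves to that one jar, so the not-found line is the useful signal there.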

Upvotes: 2
