merours

Reputation: 4106

Copying HDFS directory to local node

I'm working on a single-node Hadoop 2.4 cluster. I can copy a directory and all its contents from HDFS using hadoop fs -copyToLocal myDirectory .

However, I can't get the same operation to work with this Java code:

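public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
    // Load the default Hadoop configuration and get the configured file system.
    Configuration conf = new Configuration(true);
    FileSystem hdfs = FileSystem.get(conf);
    // delSrc = false: keep the source directory in HDFS after copying.
    hdfs.copyToLocalFile(false, new Path("myDirectory"),
                         new Path("C:/tmp"));
}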

This code copies only part of myDirectory. I also get some error messages:

14/08/13 14:57:42 INFO mapreduce.Job: Task Id : attempt_1407917640600_0013_m_000001_2, Status : FAILED
Error: java.io.IOException: Target C:/tmp/myDirectory is a directory

My guess is that multiple instances of the mapper are trying to copy the same files to the same node at the same time. However, I don't see why only part of the content gets copied.

Is that the reason for my errors, and how can I solve it?

Upvotes: 1

Views: 325

Answers (1)

Aleksei Shestakov

Reputation: 2538

You can use DistributedCache (see its documentation) to copy your files to all datanodes, or you could try copying the files in the setup method of your mapper instead of in map.
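
If the failures do come from several copies landing on the same target, as the question guesses, one way to sidestep the collision is to give each task attempt its own destination directory. The following is only a rough sketch of that idea, not the fix suggested above: it uses plain java.nio.file on the local file system so that it runs without Hadoop, and the class name PerTaskCopy, the method copyForTask and the taskId argument are all made up for illustration. In a real job the copy itself would still go through the Hadoop FileSystem API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

// Hypothetical helper (not part of Hadoop): copies a directory tree into
// a destination that is unique per task attempt.
class PerTaskCopy {

    // Copies `source` recursively to targetRoot/taskId/<name of source>
    // and returns that directory. Calling it again with the same taskId
    // overwrites files instead of failing on an existing target.
    static Path copyForTask(Path source, Path targetRoot, String taskId)
            throws IOException {
        Path target = targetRoot.resolve(taskId)
                                .resolve(source.getFileName().toString());
        // Files.walk visits a directory before its children.
        try (Stream<Path> paths = Files.walk(source)) {
            for (Path p : (Iterable<Path>) paths::iterator) {
                Path dest = target.resolve(source.relativize(p).toString());
                if (Files.isDirectory(p)) {
                    Files.createDirectories(dest);
                } else {
                    Files.createDirectories(dest.getParent());
                    Files.copy(p, dest, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
        return target;
    }
}
```

With this layout, two attempts given the (assumed) IDs attempt_0 and attempt_1 would write to C:/tmp/attempt_0/myDirectory and C:/tmp/attempt_1/myDirectory, so neither finds the other's half-written directory in its way.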

Upvotes: 1
