Reputation: 4106
I'm working on a single-node Hadoop 2.4 cluster.
I can copy a directory and all its contents from HDFS using hadoop fs -copyToLocal myDirectory .
However, I can't do the same thing successfully with this Java code:
public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
    Configuration conf = new Configuration(true);
    FileSystem hdfs = FileSystem.get(conf);
    // Copy the HDFS directory to the local file system without deleting the source.
    hdfs.copyToLocalFile(false, new Path("myDirectory"),
            new Path("C:/tmp"));
}
This code copies only part of myDirectory. I also receive error messages such as:
14/08/13 14:57:42 INFO mapreduce.Job: Task Id : attempt_1407917640600_0013_m_000001_2, Status : FAILED
Error: java.io.IOException: Target C:/tmp/myDirectory is a directory
My guess is that multiple instances of the mapper are trying to copy the same file to the same node at the same time. However, I don't see why only part of the content is copied.
Is that the cause of my errors, and how can I solve it?
Upvotes: 1
Views: 325
Reputation: 2538
You can use DistributedCache (see its documentation) to copy your files to all datanodes, or you could try copying the files in the setup method of your mapper.
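If the guess in the question is right and several task attempts are writing to the same local path, one possible workaround is to give each attempt its own target directory. The sketch below only illustrates that idea with plain java.nio standing in for the HDFS copy; the class PerTaskCopy, the helpers targetFor and copyTree, and the attempt IDs are all hypothetical names made up for this example. In a real mapper you would presumably build the destination from context.getTaskAttemptID() inside setup() and pass it to copyToLocalFile, but that wiring is an assumption and is not shown here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class PerTaskCopy {

    /** Builds a destination that is unique per task attempt (hypothetical helper). */
    static Path targetFor(Path localBase, String taskAttemptId) {
        return localBase.resolve(taskAttemptId);
    }

    /** Recursively copies src into dst; a local stand-in for the HDFS-to-local copy. */
    static void copyTree(Path src, Path dst) throws IOException {
        try (Stream<Path> paths = Files.walk(src)) {
            paths.forEach(p -> {
                Path out = dst.resolve(src.relativize(p).toString());
                try {
                    if (Files.isDirectory(p)) {
                        Files.createDirectories(out);
                    } else {
                        Files.createDirectories(out.getParent());
                        Files.copy(p, out);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    public static void main(String[] args) throws Exception {
        // Fake source directory playing the role of myDirectory.
        Path src = Files.createTempDirectory("myDirectory");
        Files.write(src.resolve("part-00000"), "hello".getBytes(StandardCharsets.UTF_8));
        Path localBase = Files.createTempDirectory("tmp");

        // Made-up attempt IDs; two workers copy concurrently into separate directories.
        String[] attempts = {"attempt_demo_m_000000_0", "attempt_demo_m_000001_0"};
        List<Thread> workers = new ArrayList<>();
        for (String id : attempts) {
            Thread t = new Thread(() -> {
                try {
                    copyTree(src, targetFor(localBase, id));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            t.join();
        }

        for (String id : attempts) {
            Path copied = targetFor(localBase, id).resolve("part-00000");
            System.out.println(id + " copied: " + Files.exists(copied));
        }
    }
}
```

The trade-off is disk space: every attempt gets a full copy, so this only makes sense for small directories. For larger data, DistributedCache is probably the better fit.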
Upvotes: 1