Sachin

Reputation: 18747

How to copy files from Hadoop cluster to local file system

Setup:

I have a map-reduce job. In the mapper class (which is obviously running on the cluster), I have code that looks something like this:

try {
    // ...
} catch (<some exception>) {
    // Do some stuff
}

What I want to change:

In the catch block, I want to copy the logs from the cluster to the local file system.

Problem:

I can see the log file in the directory on the node if I check from the command line. But when I try to copy it using org.apache.hadoop.fs.FileSystem.copyToLocalFile(boolean delSrc, Path src, Path dst), it says the file does not exist.

Can anyone tell me what I am doing wrong? I am very new to Hadoop, so maybe I am missing something obvious. Please ask me any clarifying questions, if needed, as I am not sure I have given all the necessary information.

Thanks

EDIT 1: Since I am trying to copy files from the cluster to local and the Java code is also running on the cluster, can I even use copyToLocalFile()? Or do I need to do a simple scp?
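
For reference, the copy attempt looks roughly like this (the paths and the helper method name are placeholders for illustration, not the exact code):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Called from the catch block in the mapper; paths are placeholders.
void copyLogOnFailure(Configuration conf) throws IOException {
    // FileSystem.get(conf) returns the cluster's default file system
    // (HDFS), so src is resolved against HDFS, not the node's local disk.
    FileSystem fs = FileSystem.get(conf);
    Path src = new Path("/path/to/log/on/node");   // placeholder
    Path dst = new Path("/path/to/local/dest");    // placeholder
    fs.copyToLocalFile(false, src, dst);           // this is the call that reports "file does not exist"
}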

Upvotes: 1

Views: 1688

Answers (1)

Niranjan Sarvi

Reputation: 899

The MapReduce log files are usually located on the data node's local file system, under HADOOP_LOG_DIR/userlogs/mapOrReduceTask on the node where the map/reduce task runs. Each MapReduce task generates syslog/stdout/stderr files in that directory.
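
If you do want to reach those files from Java code running on the node, keep in mind that they live on the node's local disk, not in HDFS, so you have to go through the local FileSystem; FileSystem.get(conf) resolves paths against HDFS, which would explain the "file does not exist" error. Also note that copyToLocalFile() writes to the local disk of whatever machine runs the code, so calling it inside a mapper copies to the task node itself, not to the machine you submitted the job from; to pull logs to your own machine, ssh/scp or the web UI is the way to go. A minimal sketch of reading a node-local log via the Hadoop API (the log path is a placeholder for your actual HADOOP_LOG_DIR location):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class TaskLogProbe {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();

        // Task logs live on the node's local file system, so use the
        // local FileSystem. FileSystem.get(conf) would resolve the path
        // against HDFS and report that the file does not exist.
        FileSystem localFs = FileSystem.getLocal(conf);

        // Placeholder: substitute your HADOOP_LOG_DIR and task attempt id.
        Path log = new Path("/var/log/hadoop/userlogs/<task-attempt-id>/syslog");
        System.out.println("exists: " + localFs.exists(log));
    }
}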

It is easier to use the TaskTracker's web UI to view these local log files, or you can ssh to the machine and look at the logs under the above-mentioned directory.

By default, the TaskTracker web UI URL is http://machineName:50060/

Upvotes: 1
