Neptune

Reputation: 627

Hadoop Pseudo-Distributed : SSH command

I have a single machine at my university with Hadoop configured in pseudo-distributed mode, and I need to control it from home.

If I connect via SSH, I run into some problems:

If I launch this command:

./hadoop jar 'my.jar' hdfs://localhost:54310

then the jar must be on the computer with Hadoop. Is there a solution to run a jar that is on my home computer?

Similarly, how can I use the get/put commands to get/put files from/to my home computer and the HDFS filesystem?

For now I have a Dropbox folder where I "put and move" the files, but that isn't a very clean solution.

Another big problem is that if I run a jar through SSH and then close the SSH connection, the job stops. But I need to start a job on Hadoop and then power off my home computer. Is there a solution for this problem?

Upvotes: 0

Views: 83

Answers (1)

Chaos

Reputation: 11721

Here are my answers to your questions:

  1. The jar file must be on the system with Hadoop installed in order to run it.

  2. If you're running a Windows environment on your home computer, you can use WinSCP to get/put files between your home computer and the Hadoop system. Then you'll have to issue a hadoop fs -put or hadoop fs -get command to put/get files between the local FS on the Hadoop system and HDFS. I'm not aware of an easy way to get/put files directly between your home computer and HDFS. If you're running a unix environment, you can just issue an scp command from your terminal/console (see the sketch after this list).

  3. Yes, if you SSH into a machine, issue a command, and then close the SSH connection, the execution stops. You can, however, run the command as a background process, and the execution will continue even after you close the SSH connection. You need to append an ampersand (&) to the end of your command. Example:

    ./hadoop jar 'my.jar' hdfs://localhost:54310 & 
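For point 2, here is a minimal sketch of that workflow from a unix home machine (the user name, the host name hadoop-machine, and all paths below are just placeholders, assuming the jar and an input file sit in your home directory on the Hadoop machine):

    # From your home computer: copy the jar and an input file to the Hadoop machine
    scp my.jar input.txt user@hadoop-machine:/home/user/

    # On the Hadoop machine (over SSH): put the input file into HDFS and run the job
    hadoop fs -put /home/user/input.txt /user/hadoop/input.txt
    hadoop jar /home/user/my.jar hdfs://localhost:54310

    # To retrieve results: HDFS -> local FS on the Hadoop machine, then back home
    hadoop fs -get /user/hadoop/output /home/user/output
    scp -r user@hadoop-machine:/home/user/output .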
    

EDIT

Command to redirect output to a file:

./hadoop jar 'my.jar' hdfs://localhost:54310 > outputFile & 
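If the job still stops when the SSH session closes (some shells send a hangup signal to background jobs), nohup is a common way to keep it running, for example:

nohup ./hadoop jar 'my.jar' hdfs://localhost:54310 > outputFile 2>&1 &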

Upvotes: 1
