Free Url

Reputation: 1966

Using GNU Parallel for cluster computing over LAN with rsync

I have two machines, and I want to use GNU Parallel to have multiple processes 'cat' the contents of some text files from both machines.

I have the following setup.

On a local machine, in the same directory, I have the following files:

This assumes the nodefile from the WordPress example linked below, with my IP being 192.168.0.2.

None of these files are replicated on the remote machine. I want to have multiple processes 'cat' the contents of each of the test?.txt files from both machines.

Preferably, this:

I have been able to execute multiprocessing commands remotely with the nodefile as per this wordpress example, but none involving file echoing remotely.

So far, I have something like the following:

parallel --sshloginfile nodefile --workdir . --basefile cmd.sh -a cmd.sh --trc ::: test1.txt test2.txt test3.txt

But this isn't working: it removes the files from my directory without replacing them, and it gives rsync errors. Unfortunately, I can't provide the errors at the moment or replicate the setup.

I am very inexperienced with parallel. Can anyone guide me on the syntax to accomplish this task? So far I haven't been able to find the answer in the man pages or on the web.

I am running Ubuntu 16.04 LTS with the latest version of GNU Parallel.

Upvotes: 1

Views: 363

Answers (1)

Ole Tange

Reputation: 33685

You make a few mistakes:

  • -a gives an input source. It is basically an alias for ::::, so -a cmd.sh reads the lines of cmd.sh as arguments rather than running the script.
  • You do not give the command to run. It goes after the options to GNU Parallel and before the :::.
  • --trc takes an argument (namely the file to transfer back). You do not have a file to transfer back, so use --transfer --cleanup instead.

So:

chmod +x cmd.sh
parallel --sshloginfile nodefile --workdir . --basefile cmd.sh --transfer --cleanup ./cmd.sh ::: test1.txt test2.txt test3.txt
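
The question does not show cmd.sh, so the following is only a guess at its shape: a minimal sketch of a script that takes one file name as its first argument, which is what the command above passes to it for each of test1.txt, test2.txt and test3.txt. The function name show_file and the header line are my own additions for illustration.

```shell
#!/bin/sh
# Hypothetical stand-in for cmd.sh; the real script is not shown in the question.

# show_file FILE: print a header naming the host, then the file's contents.
show_file() {
    printf '== %s on %s ==\n' "$1" "$(uname -n)"
    cat -- "$1"
}

# When invoked as ./cmd.sh FILE (as GNU Parallel does), process that file.
if [ "$#" -gt 0 ]; then
    show_file "$1"
fi
```

With --transfer, each test?.txt is copied to the remote machine before the script runs there, and --cleanup removes the copy afterwards, so the script only needs to read the file it is handed.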

It is unclear whether you want to transfer anything to the remote machine. If not, this may really be the correct answer; --nonall runs the command once on every host in nodefile:

parallel --sshloginfile nodefile --nonall --workdir . ./cmd.sh test1.txt test2.txt test3.txt
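
Since --nonall runs on every host listed in the sshloginfile, the local machine only takes part if it is listed too. The nodefile from the question is not shown, so this is a hypothetical sketch: the name nodefile.example and the login user@remote-host are placeholders, not your actual setup. In an sshloginfile, a line containing only ':' means the local machine without ssh, and lines starting with '#' are ignored.

```shell
# Write a hypothetical sshloginfile; adjust the remote login to your own machine.
cat > nodefile.example <<'EOF'
# ':' means "run on the local machine, no ssh"
:
# placeholder for the remote machine's ssh login
user@remote-host
EOF
```

Pass it with --sshloginfile nodefile.example in place of nodefile to try it without overwriting your existing file.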

Upvotes: 1
