turbo

Reputation: 1273

Bash: Unexpected parallel behavior when reading arguments from file using xargs

Previous

This is a follow-up to this question.

Specs

My system is a dedicated server running Ubuntu Desktop, Release 12.04 (precise) 64-bit, kernel 3.14.32-xxxx-std-ipv6-64. Neither the release nor the kernel can be upgraded, but I can install any package.

Problem

The problem described in the question above seems to be solved there, but the solution doesn't work for me. I've installed the latest lftp and parallel packages, and each seems to work fine on its own.

To simplify things, I cleaned the input file end_unique.txt, now it has the following format for each line:

<server>

Each line ends in a CRLF, because the file was imported from a Windows server.

Edit 1:

This is the job.sh script:

#!/bin/sh
server="$1"
lftp -e "find .; exit" "$server" >"$server-files.txt"

Edit 2:

I took the file and ran it through fromdos. Now it should be in standard Unix format, one server per line. Keep in mind that the servers in the file can vary in format:

ftp.server.com
www.server.com
server.com
123.456.789.190

etc. All of those servers are ftp servers, accessible by ftp://<serverfromfile>/.
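Where fromdos is not installed, the same line-ending cleanup can be sketched with tr. This is only an illustration: the helper name strip_cr and the output file name in the usage comment are hypothetical, and it assumes the file contains no carriage returns that need to be kept.

```shell
# Hypothetical helper: drop every carriage return from stdin, so a line
# such as "ftp.server.com\r" is passed on as plain "ftp.server.com".
strip_cr() {
  tr -d '\r'
}

# Possible usage (the output file name is made up for this example):
# strip_cr < end_unique.txt > end_unique.unix.txt
```

Without this step, the trailing carriage return would become part of "$server" in job.sh, and therefore part of both the host name passed to lftp and the output file name.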

Upvotes: 1

Views: 191

Answers (1)

Wintermute

Reputation: 44063

With :::, parallel expects the argument list for the commands it is going to run to appear on the command line, as in

parallel -j20 ./job.sh ::: server1 server2 server3

Without ::: it reads the arguments from stdin, which serves us better in this case. You can simply say

parallel -j20 ./job.sh < end_unique.txt

Addendum: Things that can go wrong

Make certain of two things:

  1. That you are using GNU parallel and not another version (such as the one from moreutils), because only (as far as I'm aware) the GNU version supports reading an argument list from stdin, and
  2. That GNU parallel is not configured to disable the GNU extensions. It turned out, after a lengthy discussion in the comments, that they are disabled by default on Ubuntu 12.04, so it is not inconceivable that this sort of thing might be found elsewhere (particularly downstream from Ubuntu). Such a configuration can hide in

    • The environment variable $PARALLEL,
    • /etc/parallel/config, or
    • ~/.parallel/config
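A quick way to look in all three places at once might be something like the following. It is a minimal sketch: the function name show_parallel_config is made up for this example, and it only prints what it finds, leaving it to you to check any option against the manual of your parallel version.

```shell
# Hypothetical helper: print any parallel options found in the three
# locations listed above. It changes nothing, it only reports.
show_parallel_config() {
  if [ -n "${PARALLEL:-}" ]; then
    printf '$PARALLEL: %s\n' "$PARALLEL"
  fi
  for f in /etc/parallel/config "$HOME/.parallel/config"; do
    # -s: the file exists and is not empty
    if [ -s "$f" ]; then
      printf '%s:\n' "$f"
      cat "$f"
    fi
  done
}

show_parallel_config
```

If this prints nothing, none of the three locations carries any options, and the cause is more likely the wrong parallel binary.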

If the GNU version of parallel is not available to you, and if your argument list is not too long for the shell and none of the arguments in it contain whitespace, the equivalent with the moreutils parallel is

parallel -j20 ./job.sh -- $(cat end_unique.txt)

This did not work for OP because the file contained more servers than the shell was willing to put into a command line, but it might work for others with similar problems.
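For a list that is too long for one command line, another option that does not depend on either parallel is xargs, which also reads its arguments from stdin. The following is a sketch rather than a drop-in replacement: run_per_line is a hypothetical helper name, -P is a GNU/BSD extension to xargs and not part of POSIX, and xargs splits on whitespace and interprets quotes, so this assumes plain host names like the ones in the question.

```shell
# Hypothetical helper: run a command once per entry of a list file,
# with at most N instances running at the same time.
#   $1 = number of parallel jobs, $2 = list file, rest = command to run
run_per_line() {
  jobs="$1"
  list="$2"
  shift 2
  # -n 1: one argument per invocation; -P: run up to $jobs at once
  xargs -n 1 -P "$jobs" "$@" < "$list"
}

# Possible usage with the names from the question:
# run_per_line 20 end_unique.txt ./job.sh
```

Because the list arrives on stdin, its length is not limited by the shell's command-line size, which is the limit OP ran into.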

Upvotes: 2
