oswcab

Reputation: 105

GNU Parallel: Argument list too long when calling function

I created a script to verify a (big) number of items. It was doing the verification serially (one item after the other), with the end result that the script took about 9 hours to complete. Looking around for ways to improve this, I found GNU parallel, but I'm having problems making it work.

The list of items is in a text file so I was doing the following:

readarray items < ${ALL_ITEMS}
export -f process_item
parallel process_item ::: "${items[@]}"

Problem is, I receive an error:

GNU parallel: Argument list too long

I understand from similar posts 1, 2, 3 that this is a Linux limitation rather than a GNU parallel one. From the answers to those posts I also tried to adapt a workaround by piping the items through head, but then only a few items (the count passed to head) are processed.

I have been able to make it work using xargs:

cat "${ALL_ITEMS}" | xargs -n 1 -P ${THREADS} -I {} bash -c 'process_item "$@"' _ {}

but I've seen GNU parallel has other nice features I'd like to use.

Any idea how to make this work with GNU parallel? By the way, the number of items is about 2.5 million and growing every day (the script runs as a cron job).

Thanks

Upvotes: 3

Views: 2217

Answers (2)

user000001

Reputation: 33317

You can pipe the file to parallel, or just use the -a (--arg-file) option. The following are equivalent:

cat "${ALL_ITEMS}" | parallel process_item 
parallel process_item < "${ALL_ITEMS}"
parallel -a "${ALL_ITEMS}" process_item
parallel --arg-file "${ALL_ITEMS}" process_item
parallel process_item :::: "${ALL_ITEMS}"
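A complete minimal sketch of the file-based approach, combining it with the function export from the question (process_item here is a hypothetical stand-in that just echoes its argument):

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the real verification function.
process_item() {
    echo "verified: $1"
}
export -f process_item

# Build a small sample item list for demonstration.
ALL_ITEMS=$(mktemp)
printf '%s\n' item1 item2 item3 > "${ALL_ITEMS}"

# Reading items from the file means they never appear on the
# command line, so 2.5 million items cannot trigger
# "Argument list too long".
parallel --arg-file "${ALL_ITEMS}" process_item

rm -f "${ALL_ITEMS}"
```

Because the items travel via a file (or stdin) instead of the argv array, the kernel's argument-size limit never comes into play.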

Upvotes: 3

ArturFH

Reputation: 1787

From man parallel:

parallel [options] [command [arguments]] < list_of_arguments

So:

export -f process_item
parallel process_item < "${ALL_ITEMS}"

probably does what you want.
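To mirror the parallelism level of the xargs version in the question, the job count can be set explicitly with parallel's -j option (THREADS and ALL_ITEMS are the question's variables; process_item below is a hypothetical stand-in):

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the real verification function.
process_item() {
    echo "done: $1"
}
export -f process_item

THREADS=2
ALL_ITEMS=$(mktemp)
printf '%s\n' alpha beta > "${ALL_ITEMS}"

# -j limits how many jobs run at once, like xargs -P.
parallel -j "${THREADS}" process_item < "${ALL_ITEMS}"

rm -f "${ALL_ITEMS}"
```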

Upvotes: 4
