Waku-2

Reputation: 1196

GNU Parallel as job queue processor

I have a worker.php file as below

<?php

$data = $argv[1];

//then some time consuming $data processing

and I run this as a poor man's job queue using GNU Parallel:

while read -r LINE; do echo "$LINE"; done < very_big_file_10GB.txt | parallel -u php worker.php

which kind of works by forking 4 PHP processes when I am on a 4-CPU machine.

But it still feels pretty synchronous to me because read LINE is still reading one line at a time.

Since it is a 10 GB file, I am wondering if I can somehow use parallel to read the same file in parallel by splitting it into n parts (where n = the number of my CPUs), which would (ideally) make my import n times faster.

Upvotes: 1

Views: 310

Answers (1)

Ole Tange

Reputation: 33685

No need to do the while business:

parallel -u php worker.php :::: very_big_file_10GB.txt

-u Ungroup output. Only use this if you are not going to use the output, as output from different jobs may mix.

:::: File input source. Equivalent to -a.
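
If you want to see what parallel will run before pointing it at the 10 GB file, --dry-run prints the generated commands without executing them. A quick check (sample.txt is just an illustrative three-line file, not part of your setup):

printf 'line1\nline2\nline3\n' > sample.txt
parallel --dry-run php worker.php :::: sample.txt

which prints:

php worker.php line1
php worker.php line2
php worker.php line3

Because the command contains no {}, each input line is appended as the last argument, which matches how worker.php reads $argv[1]. By default GNU Parallel runs one job per CPU core, just like your while/read pipeline.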

I think you will benefit from reading at least chapter 2 (Learn GNU Parallel in 15 minutes) of "GNU Parallel 2018". You can buy it at http://www.lulu.com/shop/ole-tange/gnu-parallel-2018/paperback/product-23558902.html or download it at: https://doi.org/10.5281/zenodo.1146014
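
As for splitting the 10 GB file itself into chunks: --pipepart can do that, but it passes each chunk on the worker's stdin rather than as an argument, so worker.php would have to loop over stdin instead of taking a single line in $argv[1]. A rough sketch (the 100M block size and the stdin-reading worker are illustrative, not part of the original setup):

parallel --pipepart -a very_big_file_10GB.txt --block 100M php stdin_worker.php

with stdin_worker.php along the lines of:

<?php
// hypothetical worker: processes many lines per invocation instead of one
while (($line = fgets(STDIN)) !== false) {
    $data = rtrim($line, "\n");
    // then some time consuming $data processing
}

--pipepart seeks directly into the file and splits it on line boundaries, so the single-threaded while/read loop disappears entirely.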

Upvotes: 2
