Reputation: 23
The task is to run multiple UNIX grep commands on log files at the same time, using the subprocess module. Running these grep commands one after another is time-consuming, so I want to parallelize them.
Grep commands that I want to run in parallel:
grep "start" /var/log/application/start.log.gz
grep "end" /var/log/application/end.log.gz
grep "proceed" /var/log/application/proceed.log.gz
Should I choose to use asyncio or opt for gevent?
Upvotes: 1
Views: 128
Reputation: 1
Q : Should I choose to use asyncio or opt for gevent?
Well, preferably neither. Why pay the extra costs three times (?) if you can have parallel run it?
If there is an indeed reasonable motivation for such tasks to get launched ALAP (as late as possible) and concurrently ( i.e. under one common shared "ceiling" for disk-I/O and mem-I/O throughput ), yet with minimum add-on overhead costs, then measure those costs before downvoting without having the hard facts in hand.
Python process-based execution comes with a huge add-on cost of process instantiation ( not anything near a few [us] ). It is worse on Windows-class O/S, where each sub-process starts a fresh python interpreter and has to re-create the state it needs, incl. variables, objects and all their memory-rich data-structures. Do read the multiprocessing documentation on process-based parallelism and on the additional consistency risks, because of which the fork instantiation method started to be considered unsafe on the O/S that permit it ( not all do, as of 2020-Q1 ).
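A rough way to get such a measurement is sketched below. It only times how long it takes to start a do-nothing child process and wait for it to exit; the helper name mean_spawn_seconds and the repeat count are made up for this illustration, and the numbers will differ per machine and O/S.

```python
import subprocess
import sys
import time


def mean_spawn_seconds(argv, repeats=5):
    """Return the mean wall-clock time to start argv and wait for its exit."""
    start = time.perf_counter()
    for _ in range(repeats):
        # check=True raises if the child exits with a non-zero status.
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    # A python child that does nothing: what remains is mostly start-up cost.
    cost = mean_spawn_seconds([sys.executable, "-c", "pass"])
    print(f"mean spawn + exit cost: {cost * 1e3:.2f} ms")
```

Comparing this figure with the run time of one real grep job shows whether the instantiation overhead matters for the workload at hand.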
Better to use just one process running a just-enough, right-designed and well-tuned tool that does exactly this, at once and in parallel: GNU parallel
( best start by reading man parallel for all the ready-to-use job-submission options and configurables )
Example :
Q : Grep commands that I want to run in parallel:
Python can launch just this one command and handle the log and outputs accordingly, as needed:
parallel --jobs 3 \
--halt now,fail=1 \
--joblog LastRUN.log \
grep {} /var/log/application/{}.log.gz ::: start proceed end
which results in the equivalent of running these, concurrently:
grep "start" /var/log/application/start.log.gz
grep "end" /var/log/application/end.log.gz
grep "proceed" /var/log/application/proceed.log.gz
( plus it gives more comfort in handling failed cases, logged outputs etc. )
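A minimal sketch of launching that one command from Python follows, assuming GNU parallel is installed and on the PATH. The function names build_parallel_argv and run_parallel are made up for this illustration; the flags and paths are the ones from the example above.

```python
import shutil
import subprocess


def build_parallel_argv(words, jobs=3, joblog="LastRUN.log"):
    """Build the argument list for the GNU parallel call shown above."""
    return [
        "parallel",
        "--jobs", str(jobs),
        "--halt", "now,fail=1",
        "--joblog", joblog,
        # No shell is involved, so "{}" reaches parallel unchanged.
        "grep", "{}", "/var/log/application/{}.log.gz",
        ":::", *words,
    ]


def run_parallel(words):
    """Run the jobs and return the CompletedProcess with captured output."""
    if shutil.which("parallel") is None:
        raise RuntimeError("GNU parallel was not found on PATH")
    return subprocess.run(
        build_parallel_argv(words),
        capture_output=True,
        text=True,
        errors="replace",  # tolerate non-text bytes in the output
    )


if __name__ == "__main__":
    result = run_parallel(["start", "proceed", "end"])
    print("exit status:", result.returncode)
    print(result.stdout, end="")
```

One thing to check before relying on it: grep exits with status 1 when no line matches, and with --halt now,fail=1 that counts as a failed job, so the remaining jobs get stopped. If "no match" is an acceptable outcome, the --halt setting may need adjusting.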
Upvotes: 1