Reputation: 4634
I have the following string of greps:
grep -E '[0-9]{3}\.[0-9]+ ms' file.log | grep -v "Cycle Based" | grep -Ev "[0-9]{14}\.[0-9]+ ms" > pruned.log
Which I need to run on a 10G log file. It's taking a bit longer than I am willing to wait, so I am trying to use GNU parallel, but it's not clear to me how I can execute this chain of greps using parallel.
This is not a question of how to execute the fastest possible single grep; this is about how to execute a series of greps in parallel.
Upvotes: 1
Views: 83
Reputation: 33685
Usually the limiting factor when grepping a file is how fast the disk can be read. If the file sits on a single disk, then odds are that the disk is what is limiting you, and running the greps in parallel is unlikely to help much.
However, if you have RAID10/50/60 or a distributed network filesystem, then parallelizing may speed up your processing:
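# The filter chain from the question; it reads one chunk of the log on stdin
doit() {
  grep -E '[0-9]{3}\.[0-9]+ ms' | grep -v "Cycle Based" | grep -Ev "[0-9]{14}\.[0-9]+ ms"
}
# Export the function so the shells started by parallel can see it
export -f doit
# --pipepart splits file.log into chunks, --block -1 gives one chunk per job slot, -k keeps the output in input order
parallel --pipepart -a file.log --block -1 -k doit > pruned.log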
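Before pointing this at the 10G file, it may be worth checking on a tiny input that the chunked run gives the same result as the plain pipeline. The following is only a sketch: the four sample lines are made up for illustration (they are not from the real log), and the comparison step is skipped when GNU parallel is not installed. It also assumes bash, since `export -f` is a bash feature.

```shell
#!/usr/bin/env bash
# Same filter chain as above, reading from stdin.
doit() {
  grep -E '[0-9]{3}\.[0-9]+ ms' | grep -v "Cycle Based" | grep -Ev "[0-9]{14}\.[0-9]+ ms"
}
export -f doit

# Hypothetical sample lines, invented for this check.
sample=$(mktemp)
printf '%s\n' \
  'request took 123.45 ms' \
  'Cycle Based step took 456.78 ms' \
  'stamp 12345678901234.5 ms' \
  'request took 1.2 ms' > "$sample"

# Plain, single-process run of the chain.
serial=$(doit < "$sample")
echo "$serial"

# If GNU parallel is available, run the chunked version and compare.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
  par=$(parallel --pipepart -a "$sample" --block -1 -k doit)
  if [ "$serial" = "$par" ]; then
    echo "parallel output matches serial output"
  else
    echo "parallel output differs from serial output"
  fi
fi

rm -f "$sample"
```

Of the sample lines, only the first should survive: the second contains "Cycle Based", the third has a 14-digit number before the decimal point, and the fourth has fewer than three digits before the decimal point.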
Upvotes: 2