Reputation: 657
It's best to describe the use by a hypothetical example:
Searching for some useful header info in a big collection of email storage (each email in a separate file). e.g. doing stats of top mail client apps used.
Normally with grep you can specify -m to stop at the first match, but what if an email does not contain X-Mailer (or whatever header we are looking for)? grep will then scan the whole file. Since most header blocks are under 50 lines, performance could be improved by telling grep to search only the first 50 lines of each file. I could not find a way to do that.
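For context, the approach described above can be sketched like this (the filenames, header, and contents are made-up examples); -m 1 stops after the first match per file, but a file with no match is still read to the end:

```shell
# Set up two hypothetical sample emails (names and contents are made up).
printf 'X-Mailer: FooMail 1.0\n\nbody\n' > a.eml
printf 'Subject: no mailer header\n\nbody\n' > b.eml

# -m 1 stops grep after the first match in each file, but b.eml,
# which never matches, is still read in full.
grep -m 1 -H 'X-Mailer:' *.eml
# prints: a.eml:X-Mailer: FooMail 1.0
```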
Upvotes: 2
Views: 1514
Reputation: 161684
Try this command:
for i in *
do
    head -n 50 "$i" | grep -H --label="$i" pattern
done

Sample output:

1.txt: aaaaaaaa pattern aaaaaaaa
2.txt: bbbb pattern bbbbb
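If the mail files are spread across subdirectories, the same head-plus-grep idea works with find instead of a glob (the directory layout and pattern below are illustrative):

```shell
# Build a small hypothetical tree of mail files (names made up).
mkdir -p maildir/sub
printf 'pattern in first file\n' > maildir/one.txt
printf 'nothing here\n' > maildir/sub/two.txt

# For each file, search only its first 50 lines and label matches
# with the file name (--label names the stdin stream for grep).
find maildir -type f -exec sh -c '
    head -n 50 "$1" | grep -H --label="$1" pattern
' sh {} \;
# prints: maildir/one.txt:pattern in first file
```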
Upvotes: 1
Reputation: 241741
I don't know if it would be faster, but you could do this with awk:

awk 'FNR>50{nextfile} /match me/{print; nextfile}' *.mail

This prints the first line matching match me in each file, provided it appears within that file's first 50 lines. (If you want to print the filename as well, grep-style, change print; to print FILENAME ":" $0;. The nextfile statement, which skips straight to the next input file, is supported by GNU awk and most other modern implementations.)

awk doesn't have any equivalent to grep's -r flag, but if you need to recursively scan directories, you can use find with -exec:

find /base/dir -iname '*.mail' \
    -exec awk 'FNR>50{nextfile} /match me/{print FILENAME ":" $0; nextfile}' {} +
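As a quick self-contained check of the awk one-liner (the sample files and their contents are made up):

```shell
# Create two hypothetical mail files.
printf 'From: a@example.com\nmatch me here\nmore\n' > 1.mail
printf 'From: b@example.com\nno luck\n' > 2.mail

# Print the first matching line per file, looking at no more than
# 50 lines of each file before moving on.
awk 'FNR>50{nextfile} /match me/{print FILENAME ":" $0; nextfile}' *.mail
# prints: 1.mail:match me here
```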
You could solve this problem by piping head -n 50 through grep, but that would undoubtedly be slower, since you'd have to start two new processes (one head and one grep) for each file. You could do it with just one head and one grep, but then you'd lose the ability to stop matching a file as soon as you find the magic line, and it would be awkward to label the lines with the filename.
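The single-head, single-grep variant mentioned above can be sketched with GNU head's -q option (filenames and pattern are illustrative); it demonstrates both drawbacks:

```shell
# Two hypothetical mail files.
printf 'match me early\nrest\n' > 1.mail
printf 'filler\nmatch me late\n' > 2.mail

# -q (a GNU extension) suppresses head's "==> file <==" banners, so all
# the files' first 50 lines are concatenated into one anonymous stream:
# grep cannot stop early per file, and the filenames are lost.
head -q -n 50 *.mail | grep 'match me'
# prints both matching lines, with no filename labels
```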
Upvotes: 2
Reputation: 3629
You can do something like this:

head -n 50 <mailfile> | grep <your keyword>
Upvotes: 1