Reputation: 13
My grep -P (Perl-regex) command does not capture all of a large match (~32k characters), though it has no trouble with smaller matches.
The grep command I want to use in order to grab "allowed [ < TEXT > ]":
grep -P '(?si)^\s*allowed\s*\[.*?\]' file.txt
For some reason, on a large-ish file the dot stops matching partway through. The grep above therefore matches nothing, because '.*?\]' cannot consume enough text to reach the ']'. Dropping the ']' shows how far the match gets:
grep -P '(?si)^\s*allowed\s*\[.*' bigFile.txt | wc
1883 1883 32764
Yet a bare .* still consumes the entire file:
grep -P '(?si).*' bigFile.txt | wc
10003 10003 178910
bigFile.txt:
allowed
[
com.bar.baz1
com.bar.baz2
....
com.bar.baz10000
]
As you can see, bigFile.txt should be matched in its entirety. Instead, the match stops after about 32k characters, around line 1880.
I am using grep 2.5.1. My best guess is that this version of grep can only match about 2^15 = 32768 characters within a single pattern match.
For comparison, on another machine running grep 2.6.3, the following works fine:
grep -Pzo '(?si)^\s*allowed\s*\[.*?\]' bigFile.txt
Upvotes: 0
Views: 118
Reputation: 531185
You're using a non-greedy operator in one command:
grep -P '(?si)^\s*allowed\s*\[.*?\]' file.txt
                               ^^
and a greedy operator in the other:
grep -P '(?si)^\s*allowed\s*\[.*' bigFile.txt | wc
                               ^
This may cause differences in how grep matches your file.
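To see the difference on a small input, here is a minimal sketch. It assumes a GNU grep built with PCRE support that accepts -P together with -z and -o (the question reports this combination working on 2.6.3). The sample file and the variable names are made up for illustration, and the leading ^ is left out to keep the example simple.

```shell
# Hypothetical sample: one "allowed [ ... ]" block followed by a stray "]".
sample=$(mktemp)
printf 'allowed\n[\ncom.bar.baz1\ncom.bar.baz2\n]\ntrailing ] text\n' > "$sample"

# Lazy .*? stops at the first "]" after the "[".
# -z treats the whole file as one record; tr strips the NUL that -z appends.
lazy_lines=$(grep -Pzo '(?si)allowed\s*\[.*?\]' "$sample" | tr -d '\0' | wc -l)

# Greedy .* keeps going to the last "]" in the input.
greedy_lines=$(grep -Pzo '(?si)allowed\s*\[.*\]' "$sample" | tr -d '\0' | wc -l)

# Number of newlines inside each match.
echo "lazy: $lazy_lines newlines, greedy: $greedy_lines newlines"
rm -f "$sample"
```

The greedy match should come out longer because it swallows the extra line up to the stray "]". This only illustrates the greedy/non-greedy difference; it does not by itself explain the 32k cutoff you see on grep 2.5.1.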
Upvotes: 1