Reputation: 31
It reports a count of 2, whereas the pattern occurs three times in the string:
echo "axxxaaxx" | grep -o "xx" | wc -l
echo "axxxaaxx" | grep -o "xx"
Upvotes: 1
Views: 365
Reputation: 8769
grep does not support overlapping regex matches: it consumes the characters that have been matched. In this case you can enable Perl Compatible Regular Expressions (PCRE) with the -P switch and use a positive lookahead assertion like this:
$ echo "axxxaaxx" | grep -oP "x(?=x)"
x
x
x
$ echo "axxxaaxx" | grep -oP "x(?=x)" | wc -l
3
$
regex1(?=regex2)
A positive lookahead assertion finds every regex1 that is followed by regex2. While matching characters for regex2 it does NOT consume them, which is why you get 3 matches.
x(?=x)
A positive lookahead assertion finds every x that has an x after it.
In the string xxx, the 1st x matches because it has an x after it, the 2nd x does too, and the 3rd x doesn't.
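If grep -P is not available (some grep builds ship without PCRE), the same idea of advancing one position at a time instead of consuming the match can be sketched with POSIX awk. This is only an illustration: count_overlapping is a hypothetical helper name, and it treats the pattern as a literal string rather than a regex.

```shell
# count_overlapping STRING PATTERN
# Hypothetical helper (not part of grep): counts occurrences of the literal
# PATTERN in STRING, allowing overlaps, by testing every start position.
# Note: awk -v interprets backslash escapes in the values passed to it.
count_overlapping() {
  awk -v s="$1" -v p="$2" 'BEGIN {
    n = 0
    if (length(p) > 0) {
      # Slide a window of length(p) one character at a time.
      for (i = 1; i + length(p) - 1 <= length(s); i++) {
        if (substr(s, i, length(p)) == p) {
          n++
        }
      }
    }
    print n
  }'
}

count_overlapping "axxxaaxx" "xx"
```

Because the window moves by one character regardless of whether the previous position matched, overlapping occurrences are counted, which is the effect the lookahead achieves inside grep.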
More info and easy examples can be found here
Upvotes: 2
Reputation: 47099
Using -P will enable PCRE, which supports lookarounds:
echo "axxxaaxx" | grep -P '(?<=x)x'
In this case we are using a lookbehind, which means that we match an x that has an x before it. This allows overlapping matches:
How the regex is "evaluated":
 xxx
^^
|Cursor
Looking for x on this position, since there is nothing this will not match

 xxx
 ^^
 |Cursor
 Looking for x on this position, since it's found we got a match

 xxx
  ^^
  |Cursor
  Looking for x on this position, since it's found we got a match
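To turn the walk-through into a count, the lookbehind can be combined with -o and wc -l in the same way as the lookahead. The sketch below assumes GNU grep built with PCRE support; the variable names input, ahead and behind are only illustrative.

```shell
# Assumes GNU grep with -P (PCRE) support.
input="axxxaaxx"

# Lookahead: an x that is followed by another x.
ahead=$(printf '%s\n' "$input" | grep -oP 'x(?=x)' | wc -l)

# Lookbehind: an x that is preceded by another x.
behind=$(printf '%s\n' "$input" | grep -oP '(?<=x)x' | wc -l)

echo "lookahead: $ahead, lookbehind: $behind"
```

Both forms match a single x and only peek at its neighbour, so each should count every overlapping occurrence of xx once.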
Upvotes: 1