Reputation:
echo "xxabc jkl" | grep -onP '\w+(?!abc\b)'
1:xxabc
1:jkl
Why is the result not the following?
echo "xxabc jkl" | grep -onP '\w+(?!abc\b)'
1:jkl
The first string, xxabc, ends with abc.
I want to extract all words that do not end with abc, so why is xxabc matched?
How can I fix it so that the output is only 1:jkl?
Why doesn't '\w+(?!abc\b)' work?
Upvotes: 2
Views: 822
Reputation: 89557
Without pcregrep's special features, you can do it by adding a pipe to sed:
echo "xxabc jkl" | sed 's/[a-zA-Z]*abc\>//g' | grep -onE '[a-zA-Z]+'
or with awk:
echo "xxabc jkl" | awk -F'[^a-zA-Z]+' '{for(i=1;i<=NF;i++){ if ($i!~/abc$/) printf "%s: %s\n",NR,$i }}'
Another approach:
echo "xxabc jkl" | awk -F'([^a-zA-Z]|[a-zA-Z]*abc\\>)+' '{OFS="\n"NR": ";if ($1) printf OFS;$1=$1}1'
Upvotes: 1
Reputation: 626806
The \w+(?!abc\b) pattern matches xxabc because \w+ matches 1 or more word chars greedily, and thus grabs xxabc at once. Then, the negative lookahead (?!abc\b) makes sure there is no abc with a trailing word boundary immediately to the right of the current location. Since after xxabc there is no abc with a trailing word boundary, the match succeeds.
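To see that the lookahead inspects the text to the right of the position where \w+ stopped, it may help to compare the negative lookahead with its positive counterpart. This is only a sketch: it assumes GNU grep built with PCRE support, and the variable names are made up for illustration.

```shell
# Assumes GNU grep with PCRE support (the -P option).
input="xxabc jkl"

# Negative lookahead: \w+ has already consumed "xxabc", so no "abc"
# remains to the right and the assertion passes.
negative=$(printf '%s\n' "$input" | grep -onP '\w+(?!abc\b)')

# Positive lookahead, for contrast: \w+ has to backtrack until "abc"
# lies to the right, which leaves only "xx" in the match.
positive=$(printf '%s\n' "$input" | grep -onP '\w+(?=abc\b)')

printf '%s\n' "$negative"   # prints 1:xxabc and 1:jkl
printf '%s\n' "$positive"   # prints 1:xx
```

The second command shows that a lookahead only constrains what follows the match, so it cannot reject a word based on characters that \w+ has already consumed.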
To match all words that do not end with abc using a PCRE regex, you may use
echo "xxabc jkl" | grep -onP '\b\w+\b(?<!abc)'
Details
\b - a leading word boundary
\w+ - 1 or more word chars
\b - a trailing word boundary
(?<!abc) - a negative lookbehind that fails the match if the 3 letters immediately to the left of the current location are abc
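As a quick check of the breakdown above, the pattern can be run against a slightly longer sample. The extra words abcd and abc are not from the question and are added here only to illustrate the edge cases; GNU grep with PCRE support is assumed.

```shell
# Assumes GNU grep with PCRE support (the -P option).
# "abcd" contains abc but does not end with it; "abc" is the suffix itself.
words="xxabc jkl abcd abc"

# The lookbehind is tested after the trailing \b, i.e. at the end of each word.
kept=$(printf '%s\n' "$words" | grep -onP '\b\w+\b(?<!abc)')

printf '%s\n' "$kept"   # prints 1:jkl and 1:abcd
```

Because both word boundaries pin the match to a whole word, the regex engine cannot backtrack to a shorter substring such as xxab, so words ending with abc are skipped entirely.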
Upvotes: 1