Reputation: 186
I have to find words.
In my assignment a word is defined as letters between two spaces (" bla "). I have to find a decimalIntegerConstant like this but it has to be a word.
I use
grep -E -o " (0|[1-9]+[0-9]*)([Ll]?) "
but it doesn't work on, for example:
bla 0l labl 2 3 abla0La 0L sfdgpočítačsd
Output is
0l
2
0L
but 3 is missing.
Upvotes: 1
Views: 382
Reputation: 119877
Matches don't overlap. Your regex has already matched " 2 ", including the blank after the 2. That blank is consumed, so it cannot serve as the leading space of a match for 3.
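To see the effect in isolation, here is a minimal sketch on a two-number line. The variable names `line` and `matches` are made up for illustration, and it assumes a grep that supports `-o` (GNU and BSD grep do):

```shell
# Two "words" that share a single separating space.
line=' 2 3 '

# Every match needs its own leading AND trailing space.
# The match " 2 " consumes the space that "3" would need in front of it.
matches=$(printf '%s\n' "$line" | grep -E -o ' [0-9]+ ')

# Brackets make the surrounding spaces visible.
printf '[%s]\n' "$matches"   # prints "[ 2 ]"; 3 is never reported
```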
POSIX grep cannot do what you want in one step, but you can do something like this in two stages (simplified from your regex, doesn't support [lL]):
grep -o ' [0-9 ]* ' | grep -E -o '[0-9]+'
That is, match a sequence of space-separated numbers with leading and trailing spaces, and from that, match individual numbers regardless of spaces. De-simplify the definition of number to suit your needs.
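One possible way to de-simplify it so that the optional l/L suffix and the no-leading-zero rule are kept is sketched below. The names `line`, `constant` and `result` are hypothetical, the sample line is a shortened, ASCII-only version of the one in the question, and the sketch assumes a grep that supports `-o` and `-E`:

```shell
# Shortened sample line from the question (ASCII only).
line=' bla 0l labl 2 3 abla0La 0L end '

# A decimal integer constant: 0, or a non-zero digit followed by digits,
# with an optional l/L suffix.
constant='(0|[1-9][0-9]*)[Ll]?'

# Stage 1: a space followed by one or more "constant + space" groups,
#          so a run of adjacent constants is captured as one match.
# Stage 2: pull the individual constants out of each run.
result=$(printf '%s\n' "$line" |
  grep -E -o " ($constant )+" |
  grep -E -o "$constant")

printf '%s\n' "$result"
# prints:
# 0l
# 2
# 3
# 0L
```

Stage 2 is safe here only because stage 1 has already thrown away everything that is not a whole word; running the stage 2 pattern on the raw line would also pick up the 0L inside abla0La.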
Perl-compatible regular expressions have a way to match stuff without consuming it, for example, as mentioned in the comments:
grep -oP " (0|[1-9]+[0-9]*)[Ll]?(?= )"
(?= ) is a lookahead assertion, which means grep will look ahead in the input stream and make sure the match is followed by a space. The space will not be considered a part of the match and will not be consumed. When no space is found, the match fails.
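A quick sketch of the lookahead version on a shortened, ASCII-only copy of the sample line from the question. It assumes GNU grep built with PCRE support (`-P`); `line` and `result` are made-up names, and the `tr` step is my addition to drop the leading space that is still part of each match:

```shell
# Shortened sample line from the question (ASCII only).
line=' bla 0l labl 2 3 abla0La 0L end '

# The trailing space is only asserted, not consumed, so it is still
# available as the leading space of the next match.
result=$(printf '%s\n' "$line" |
  grep -o -P ' (0|[1-9][0-9]*)[Ll]?(?= )' |
  tr -d ' ')

printf '%s\n' "$result"
# prints:
# 0l
# 2
# 3
# 0L
```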
PCRE support (-P) is not guaranteed to be available in every implementation of grep.
Edit: -o is not specified by POSIX either.
Upvotes: 1