Reputation: 609
I'm trying to use a negative look-ahead in a Perl regular expression to exclude certain strings from a target string. Please give me your advice.
I want to match only the strings that do not contain -sm, -sp, or -sa.
REGEX:
hostname .+-(?!sm|sp|sa).+
INPUT
hostname 9amnbb-rp01c
hostname 9tlsys-eng-vm-r04-ra01c
hostname 9tlsys-eng-vm-r04-sa01c
hostname 9amnbb-sa01
hostname 9amnbb-aaa-sa01c
Expected Output:
hostname 9amnbb-rp01c - SELECTED
hostname 9tlsys-eng-vm-r04-ra01c - SELECTED
hostname 9tlsys-eng-vm-r04-sa01c
hostname 9amnbb-sa01
hostname 9amnbb-aaa-sa01c
However, this is the actual output I got:
hostname 9amnbb-rp01c - SELECTED
hostname 9tlsys-eng-vm-r04-ra01c - SELECTED
hostname 9tlsys-eng-vm-r04-sa01c - SELECTED
hostname 9amnbb-sa01
hostname 9amnbb-aaa-sa01c - SELECTED
Please help me.
p.s.: I used Regex Coach to visualize my result.
Upvotes: 4
Views: 450
Reputation: 208405
Move the .+- inside of the lookahead:
hostname (?!.+-(?:sm|sp|sa)).+
Rubular: http://www.rubular.com/r/OuSwOLHhEy
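As a quick check outside Rubular, here is a minimal sketch that runs the regex above against the five lines from the question. It assumes Python's re module, which accepts the same lookahead syntax as Perl for this pattern; the variable names are only illustrative.

```python
import re

# Sample lines copied from the question.
lines = [
    "hostname 9amnbb-rp01c",
    "hostname 9tlsys-eng-vm-r04-ra01c",
    "hostname 9tlsys-eng-vm-r04-sa01c",
    "hostname 9amnbb-sa01",
    "hostname 9amnbb-aaa-sa01c",
]

# The lookahead is tested once, right after "hostname ", and scans the
# rest of the line for "-sm", "-sp" or "-sa" at any position.
fixed = re.compile(r"hostname (?!.+-(?:sm|sp|sa)).+")

selected = [line for line in lines if fixed.fullmatch(line)]
for line in selected:
    print(line, "- SELECTED")
```

Only the first two lines should be selected, which is the expected output in the question.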
Your current expression is not working properly because when the .+- is outside of the lookahead, it can backtrack until the lookahead no longer causes the regex to fail. For example, with the string hostname 9amnbb-aaa-sa01c and the regex hostname .+-(?!sm|sp|sa).+, the first .+ would match 9amnbb, the lookahead would see aa as the next two characters and continue, and the second .+ would match aaa-sa01c.
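The backtracking can be made visible by wrapping the two .+ parts of the original regex in capture groups. This is a small sketch using Python's re module (an assumption; the question used Regex Coach), and the capture groups are added only for illustration.

```python
import re

# Original regex from the question, with capture groups added around both ".+"
# parts so we can see where the engine ends up after backtracking.
original = re.compile(r"hostname (.+)-(?!sm|sp|sa)(.+)")

m = original.fullmatch("hostname 9amnbb-aaa-sa01c")
first, second = m.group(1), m.group(2)
print(repr(first), repr(second))

# With a single hyphen there is no earlier "-" to fall back to,
# so the lookahead rejection sticks and the line is not matched.
single = original.fullmatch("hostname 9amnbb-sa01")
print(single)
```

The first .+ gives up the last hyphen and settles on the earlier one, which is why this line is selected while hostname 9amnbb-sa01 is not.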
An alternative to my current regex would be the following:
hostname .+-(?!sm|sp|sa)[^-]+?$
This prevents the backtracking because no - can occur after the lookahead. The non-greedy ? is used so that this works correctly in a multiline global mode.
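Here is a hedged sketch of the alternative regex run over all the sample lines at once, assuming Python's re module with the MULTILINE flag standing in for a "multiline global" mode. The last input, hostname 9amnbb-sa01-rp01c, is a hypothetical line that is not in the question; it is included to show that this alternative only inspects the part after the final hyphen, so it is not fully equivalent to the lookahead-first regex.

```python
import re

# Sample lines from the question, joined into one multi-line string.
text = "\n".join([
    "hostname 9amnbb-rp01c",
    "hostname 9tlsys-eng-vm-r04-ra01c",
    "hostname 9tlsys-eng-vm-r04-sa01c",
    "hostname 9amnbb-sa01",
    "hostname 9amnbb-aaa-sa01c",
])

# "[^-]+?$" forces the hyphen before the lookahead to be the last one on the line.
alt = re.compile(r"hostname .+-(?!sm|sp|sa)[^-]+?$", re.MULTILINE)
selected_alt = alt.findall(text)
print(selected_alt)

# Hypothetical input (not from the question): "-sa" appears in the middle.
fixed = re.compile(r"hostname (?!.+-(?:sm|sp|sa)).+")
hypothetical = "hostname 9amnbb-sa01-rp01c"
alt_accepts = alt.fullmatch(hypothetical) is not None
fixed_accepts = fixed.fullmatch(hypothetical) is not None
print(alt_accepts, fixed_accepts)
```

On the question's data both regexes select the same two lines; they differ only when -sm, -sp or -sa appears somewhere other than the last segment.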
Upvotes: 4
Reputation: 639
The following passes your test cases:
hostname [^-]+(-(?!sm|sp|sa)[^-]+)+$
I think it is a little easier to read than F.J.'s answer.
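A minimal sketch to check that claim against the question's inputs, again assuming Python's re module rather than Perl. Note that this regex applies the lookahead to every hyphen-separated segment and requires at least one hyphen in the name.

```python
import re

# Sample lines copied from the question.
lines = [
    "hostname 9amnbb-rp01c",
    "hostname 9tlsys-eng-vm-r04-ra01c",
    "hostname 9tlsys-eng-vm-r04-sa01c",
    "hostname 9amnbb-sa01",
    "hostname 9amnbb-aaa-sa01c",
]

# Each "-segment" must not start with sm, sp or sa; "$" anchors the end of the line.
segmented = re.compile(r"hostname [^-]+(-(?!sm|sp|sa)[^-]+)+$")

selected_segmented = [line for line in lines if segmented.match(line)]
for line in selected_segmented:
    print(line, "- SELECTED")
```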
To answer Rudy: the question was posed as an exclusion-of-cases situation. That seems to fit negative lookahead well. :)
Upvotes: 1