Jaeh

Reputation: 609

Regular expression - Negative look-ahead

I'm trying to use a negative look-ahead in a Perl regular expression to exclude certain strings from the lines I match. Please give me your advice.

I want to select the strings that do not contain -sm, -sp, or -sa.

REGEX:

hostname .+-(?!sm|sp|sa).+

INPUT:

hostname 9amnbb-rp01c
hostname 9tlsys-eng-vm-r04-ra01c
hostname 9tlsys-eng-vm-r04-sa01c
hostname 9amnbb-sa01
hostname 9amnbb-aaa-sa01c

Expected Output:

hostname 9amnbb-rp01c              - SELECTED
hostname 9tlsys-eng-vm-r04-ra01c   - SELECTED 
hostname 9tlsys-eng-vm-r04-sa01c
hostname 9amnbb-sa01
hostname 9amnbb-aaa-sa01c

However, this is the actual output I got:

hostname 9amnbb-rp01c              - SELECTED
hostname 9tlsys-eng-vm-r04-ra01c   - SELECTED
hostname 9tlsys-eng-vm-r04-sa01c   - SELECTED
hostname 9amnbb-sa01
hostname 9amnbb-aaa-sa01c          - SELECTED

Please help me.

P.S. I used Regex Coach to visualize my results.

Upvotes: 4

Views: 450

Answers (2)

Andrew Clark

Reputation: 208405

Move the .+- inside of the lookahead:

hostname (?!.+-(?:sm|sp|sa)).+

Rubular: http://www.rubular.com/r/OuSwOLHhEy
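As a quick check, the pattern above can be run against the five sample lines from the question. This is only a minimal sketch, written in Python rather than Perl (an assumption made for illustration; the lookahead syntax used here is the same in both), and the variable names are hypothetical.

```python
import re

# Pattern from this answer: the lookahead scans the whole rest of the line
# for a hyphen followed by sm, sp, or sa before anything is consumed.
pattern = re.compile(r"hostname (?!.+-(?:sm|sp|sa)).+")

# Sample input copied from the question.
lines = [
    "hostname 9amnbb-rp01c",
    "hostname 9tlsys-eng-vm-r04-ra01c",
    "hostname 9tlsys-eng-vm-r04-sa01c",
    "hostname 9amnbb-sa01",
    "hostname 9amnbb-aaa-sa01c",
]

# Keep only the lines the pattern matches in full.
selected = [line for line in lines if pattern.fullmatch(line)]
for line in selected:
    print(line)
```

Only the first two lines should be printed, which agrees with the expected output in the question.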

Your current expression is not working properly because when the .+- is outside of the lookahead, it can backtrack until the lookahead no longer causes the regex to fail. For example, with the string hostname 9amnbb-aaa-sa01c and the regex hostname .+-(?!sm|sp|sa).+, the first .+ would match 9amnbb, the lookahead would see aa as the next two characters and succeed, and the second .+ would match aaa-sa01c.
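The backtracking described above can be made visible by adding capture groups around the two .+ parts of the question's regex. The groups are added purely for illustration and are not part of the original pattern; the sketch uses Python, but the same behaviour is expected in Perl.

```python
import re

# The question's pattern, with capture groups added (illustration only)
# so the split point chosen by backtracking can be inspected.
original = re.compile(r"hostname (.+)-(?!sm|sp|sa)(.+)")

m = original.match("hostname 9amnbb-aaa-sa01c")
first, second = m.group(1), m.group(2)
print(first)   # 9amnbb
print(second)  # aaa-sa01c

# With a single hyphen there is no earlier hyphen to backtrack to,
# so the lookahead cannot be sidestepped and the match fails.
no_match = original.match("hostname 9amnbb-sa01")
print(no_match)  # None
```

The first .+ starts out greedy, gives characters back until it reaches a hyphen that is not followed by sm, sp, or sa, and the second .+ then swallows the unwanted -sa01c suffix.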

An alternative to my current regex would be the following:

hostname .+-(?!sm|sp|sa)[^-]+?$

This prevents the backtracking because no - can occur after the lookahead. The non-greedy ? is used so that the pattern also works correctly in a multiline global mode, where a greedy [^-]+ could otherwise run across a line break.
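To illustrate why the non-greedy quantifier matters in multiline mode, here is a small sketch comparing the pattern with and without the ?. It is written in Python with re.MULTILINE standing in for Perl's /m and findall standing in for /g. The second input line, hostname core01, is a hypothetical hyphen-free hostname that does not appear in the question; it is there only to trigger the difference.

```python
import re

# First line is from the question; the second line is hypothetical.
text = "hostname 9amnbb-rp01c\nhostname core01"

lazy = re.compile(r"hostname .+-(?!sm|sp|sa)[^-]+?$", re.MULTILINE)
greedy = re.compile(r"hostname .+-(?!sm|sp|sa)[^-]+$", re.MULTILINE)

# [^-] also matches a newline, so the greedy form can run into the next line.
lazy_hits = lazy.findall(text)
greedy_hits = greedy.findall(text)

print(lazy_hits)
print(greedy_hits)
```

The non-greedy version stops at the end of the first line, while the greedy version returns a single match that spans both lines.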

Upvotes: 4

RichTBreak

Reputation: 639

The following passes your test cases:

hostname [^-]+(-(?!sm|sp|sa)[^-]+)+$
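A minimal way to confirm this against the question's sample input, sketched in Python (an assumption for illustration; the regex syntax used here is shared with Perl, and the variable names are hypothetical):

```python
import re

# Each hyphen-separated segment is checked by its own lookahead, and
# [^-]+ cannot cross a hyphen, so there is no way to skip a segment.
pattern = re.compile(r"hostname [^-]+(-(?!sm|sp|sa)[^-]+)+$")

# Sample input copied from the question.
lines = [
    "hostname 9amnbb-rp01c",
    "hostname 9tlsys-eng-vm-r04-ra01c",
    "hostname 9tlsys-eng-vm-r04-sa01c",
    "hostname 9amnbb-sa01",
    "hostname 9amnbb-aaa-sa01c",
]

selected = [line for line in lines if pattern.match(line)]
for line in selected:
    print(line)
```

Only the first two lines should be selected.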

I think it is a little easier to read than F.J.'s answer.

To answer Rudy: the question was posed as an exclusion-of-cases situation. That seems to fit negative lookahead well. :)

Upvotes: 1
