Reputation: 41
I'm trying to create a regex that matches two words (in order), but only when a certain string does not appear between them.
I need a match when "Audio" and "Spanish" are not separated by "<br />".
Test String:
Dolby Digital Audio 2.0 Language French<br /> Dolby Digital 5.1
Audio Language Spanish<br /> Dolby Digital Audio Language 7.1
English<br /> Subtitles Language Spanish <br />
False positive:
/Audio.*((?!\<br\ \>).).*Spanish/i
What am I doing wrong here?
Upvotes: 1
Views: 137
Reputation: 600
If I'm understanding your question correctly, you'd like to capture one or more words between "Audio" and "Spanish", unless those words contain <br />.
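The first .* is greedy and unconstrained, so it happily matches across <br />. The negative lookahead is then tested at only a single position, for example the space between <br /> and Spanish, where it succeeds, so it never prevents the match.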
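To make the false positive concrete, here is a minimal sketch in Python (an assumption; the question does not name a regex flavor) that runs the pattern from the question against the test string. The three lines of the test string are joined with spaces, which is also an assumption. Note as well that the lookahead as written tests for "<br >" without the slash, so it could not trip on "<br />" even at the one position where it is checked.

```python
import re

# Test string from the question, joined onto one line
# (assumption: the original line breaks were just wrapping).
text = (
    "Dolby Digital Audio 2.0 Language French<br /> Dolby Digital 5.1 "
    "Audio Language Spanish<br /> Dolby Digital Audio Language 7.1 "
    "English<br /> Subtitles Language Spanish <br />"
)

# The pattern from the question, unchanged; /i becomes re.IGNORECASE.
original = re.compile(r"Audio.*((?!\<br\ \>).).*Spanish", re.IGNORECASE)

m = original.search(text)
matched = m.group(0)

# The greedy .* runs from the first "Audio" to the last "Spanish",
# so the match spans several "<br />" separators.
print(matched)
print("<br />" in matched)  # True
```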
Audio\s*((?:(?!<br\ \/>).)*?)\s*Spanish
Broken down a bit:
Audio
\s*
( # the capture group
(?:
(?!<br\ \/>). # any character such that it doesn't begin the string "<br />"
)*? # 0+ times; lazy
)
\s*
Spanish
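As a quick check, the breakdown above can be run more or less verbatim in Python using re.VERBOSE (Python is an assumption here, as is joining the question's test string onto one line). This is only a sketch of how the pattern behaves, not the only way to write it.

```python
import re

# Test string from the question, joined onto one line (assumption).
text = (
    "Dolby Digital Audio 2.0 Language French<br /> Dolby Digital 5.1 "
    "Audio Language Spanish<br /> Dolby Digital Audio Language 7.1 "
    "English<br /> Subtitles Language Spanish <br />"
)

# Same pattern as above, in free-spacing form. The escaped space in
# <br\ \/> matters here because re.VERBOSE ignores unescaped spaces.
pattern = re.compile(
    r"""
    Audio
    \s*
    (                    # the capture group
      (?:
        (?!<br\ \/>).    # any character that does not begin <br />
      )*?                # 0+ times; lazy
    )
    \s*
    Spanish
    """,
    re.VERBOSE,
)

m = pattern.search(text)
whole_match = m.group(0)           # the full matched text
captured = pattern.findall(text)   # group 1 of every match

print(whole_match)  # Audio Language Spanish
print(captured)     # ['Language']
```

Only the second "Audio" matches, because it is the only one followed by "Spanish" with no <br /> in between.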
You can see it in action.
The above is an edited post; previous iterations:
Audio\s*((?!\s*\<br\ \/>).*?)\s*Spanish
Thanks to Christian for pointing out that the above would match if <br /> were preceded by non-space characters, e.g. Audio foo <br /> Spanish.
Audio\s*((?!.*\<br\ \/>).*?)\s*Spanish
This was still pretty flawed and failed if there was a trailing <br /> after "Spanish".
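The two failure modes can be reproduced side by side. The sketch below uses Python (an assumption), and the names first_try, second_try and tempered are made up for illustration. The first input comes from the example above; the second is a shortened stand-in for the trailing <br /> case.

```python
import re

# Hypothetical names for the three iterations in this post.
first_try = re.compile(r"Audio\s*((?!\s*\<br\ \/>).*?)\s*Spanish")
second_try = re.compile(r"Audio\s*((?!.*\<br\ \/>).*?)\s*Spanish")
tempered = re.compile(r"Audio\s*((?:(?!<br\ \/>).)*?)\s*Spanish")

separated = "Audio foo <br /> Spanish"    # should NOT match
trailing = "Audio Language Spanish<br />"  # should match

def matches(pattern, s):
    """Return True if the pattern is found anywhere in s."""
    return pattern.search(s) is not None

results = {
    "first_try": (matches(first_try, separated), matches(first_try, trailing)),
    "second_try": (matches(second_try, separated), matches(second_try, trailing)),
    "tempered": (matches(tempered, separated), matches(tempered, trailing)),
}

# Each tuple is (matched separated input, matched trailing input);
# the desired result is (False, True).
for name, outcome in results.items():
    print(name, outcome)
```

The first iteration checks the lookahead only once, right after "Audio", so it misses a <br /> that comes later. The second looks ahead across the whole rest of the string, so any <br /> after "Spanish" blocks the match. Checking the lookahead before every consumed character avoids both problems.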
Upvotes: 2