Reputation: 291
I have the following list of products (in a .txt file):
#ART#NC3FX;price1
#ART#NC3FX;price2
#ART#NC3FX;price3
#ART#NC3FXX;price1
#ART#NC3FXX;price2
#ART#NC3FXX;price3
#ART#NC3FXX;price1
#ART#NC3FXX;price2
#ART#NC3FXX;price3
#ART#NC3FX-HD;price1
#ART#NC3FX-HD;price2
#ART#NC3FX-HD;price3
I'd like to get all the occurrences of the first one (ART#NC3FX).
Using this regular expression
@"(^|\b)#ART#NC3FX(\b|$)";
I retrieve the first three lines, which is fine, but I also get the lines for the reference #ART#NC3FX-HD.
What should I do to prevent this from happening?
Thanks!
Upvotes: 1
Views: 75
Reputation: 2352
I'm not sure I understand your question correctly, but why don't you look for the first ; like:
@"^#ART#NC3FX(;|$)"
EDIT: See Avinash's Answer
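As a rough illustration of the semicolon idea, here is a minimal Python sketch run against a shortened copy of the list from the question. The trailing .* (to grab the rest of the line) and the variable names are my own additions, and re.MULTILINE plays the role that RegexOptions.Multiline would in .NET; this assumes every product code in the file is followed by a ;.

```python
import re

# Shortened sample of the file contents from the question.
text = "\n".join([
    "#ART#NC3FX;price1",
    "#ART#NC3FXX;price1",
    "#ART#NC3FX-HD;price1",
    "#ART#NC3FX;price2",
])

# ^ anchors at each line start (MULTILINE); the code must be followed
# by ';' or by the end of the line. '.*' (added here for illustration)
# consumes the rest of the line so the whole line is returned.
pattern = re.compile(r"^#ART#NC3FX(;|$).*", re.MULTILINE)

hits = [m.group(0) for m in pattern.finditer(text)]
print(hits)
```

Only the plain #ART#NC3FX lines should come back, because neither X nor - can satisfy the (;|$) group.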
Upvotes: 1
Reputation: 626747
Your regex finds a match because the hyphen (-) is not a word character, and you tell the regex engine (with \b) that the character after X should be a non-word character. So, you get a match.
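To see the problem in isolation, here is a small Python sketch that runs the pattern from the question over one line of each product. Python is used only for illustration (\b behaves the same way in .NET for these inputs), and the variable names are hypothetical.

```python
import re

# One sample line per product code from the question.
lines = [
    "#ART#NC3FX;price1",
    "#ART#NC3FXX;price1",
    "#ART#NC3FX-HD;price1",
]

# The pattern from the question, unchanged.
original = re.compile(r"(^|\b)#ART#NC3FX(\b|$)")

# \b after the final X succeeds before ';' and before '-', because both
# are non-word characters; it fails before a second 'X'.
matched = [line for line in lines if original.search(line)]
print(matched)
```

The -HD line is kept while the NC3FXX line is dropped, which is exactly the behaviour described in the question.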
You may use a negative lookahead:
@"\B#ART#NC3FX(?![\w-])"
See regex demo
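The \B matches at the beginning of the string or at any other position that is not a word boundary. Since # is a non-word character, that means the start of the string or right after another non-word character. The (?![\w-]) lookahead fails the match if the product code is followed by a word character or a hyphen. If you test independent strings, replace \B with ^ (start of string).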
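A quick way to check the lookahead version is the following Python sketch. It assumes each line of the file is tested as an independent string; the helper name keep_exact is made up for this example, and the two patterns are the \B and ^ variants described above.

```python
import re

lines = [
    "#ART#NC3FX;price1",
    "#ART#NC3FXX;price1",
    "#ART#NC3FX-HD;price1",
]

# (?![\w-]) rejects the match when the code is followed by a word
# character (the extra 'X') or by a hyphen (the '-HD' suffix).
with_non_boundary = re.compile(r"\B#ART#NC3FX(?![\w-])")
with_line_start = re.compile(r"^#ART#NC3FX(?![\w-])")

def keep_exact(pattern, candidates):
    """Return the candidate lines in which the pattern is found."""
    return [line for line in candidates if pattern.search(line)]

exact_b = keep_exact(with_non_boundary, lines)
exact_caret = keep_exact(with_line_start, lines)
print(exact_b, exact_caret)
```

Both variants should keep only the plain #ART#NC3FX line. If other separators can follow the code in your data, extend the character class in the lookahead accordingly.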
Upvotes: 2