Reputation: 568
From what I have read, if I want to capture specific words, I can use word boundaries.
I have the following input:
First: 12345Z Apr 16 Something WORD: ABC Notification #1234 Key1 Key2 Key3 Key4
Second: 12345Z Apr 16 Something WORD: ABC Notification #1234 Key5 Key3 Key6
Third: 12345Z Apr 16 Something WORD: ABC Notification #1234 Key7 Key6
I used the following regex, but it only matches Key7
when I need it to match Key3, Key4, Key6 and Key7:
(?<=#\d{4}\s)(\b(Key3|Key4|Key6|Key7)\b)
Upvotes: 2
Views: 540
Reputation: 627468
The issue is that your regex does not allow anything between the # + 4 digits and the keys you are interested in. You can easily fix it by adding an optional (?:\s+\w+)*
pattern to the lookbehind that will match zero or more sequences of 1+ whitespace characters followed by 1+ word characters:
(?<=#\d{4}(?:\s+\w+)*)\s*\b(Key3|Key4|Key6|Key7)\b
          ^^^^^^^^^^^
See the regex demo. Declare the pattern with a verbatim string literal in C# (or use it as is in VB.NET) and use it with Regex.Matches.
A C# demo:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

var strs = new List<string> {
    "First: 12345Z Apr 16 Something WORD: ABC Notification #1234 Key1 Key2 Key3 Key4",
    "Second: 12345Z Apr 16 Something WORD: ABC Notification #1234 Key5 Key3 Key6",
    "Third: 12345Z Apr 16 Something WORD: ABC Notification #1234 Key7 Key6" };
foreach (var s in strs) {
    // Group 1 holds the key alone; m.Value would also include the whitespace consumed by \s*
    var result = Regex.Matches(s, @"(?<=#\d{4}(?:\s+\w+)*)\s*\b(Key3|Key4|Key6|Key7)\b")
        .Cast<Match>()
        .Select(m => m.Groups[1].Value)
        .ToList();
    foreach (var e in result)
        Console.WriteLine(e);
}
Upvotes: 1
Reputation: 2557
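Try this, which matches the wanted keys only while they directly follow the number or the previous match (as in the third line):
(?:#\d{4}|\G)\s(\b(?:Key3|Key4|Key6|Key7)\b)
or this, which also steps over the other words and fills group 1 only for the wanted keys:
(?:#\d{4}|\G) \b(?:(Key3|Key4|Key6|Key7)|\w+)\b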
Explanation:
(?: … ) : Non-capturing group
\ : Escapes a special character
| : Alternation / OR operand
\G : Beginning of string or end of previous match
\s : "Whitespace character": space, tab, newline, carriage return, vertical tab
( … ) : Capturing group
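A minimal C# sketch of how the second pattern could be used, assuming the words after the number are separated by single spaces as in the sample lines. The helper name WantedKeys is hypothetical and not part of the answer. Every match consumes one word after the number, so the code keeps only the matches where group 1 actually participated:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration: returns the wanted keys that follow "#dddd".
static List<string> WantedKeys(string input)
{
    // \G chains each match to the end of the previous one, so the pattern walks
    // word by word after the number; group 1 is set only for the wanted keys.
    const string pattern = @"(?:#\d{4}|\G) \b(?:(Key3|Key4|Key6|Key7)|\w+)\b";
    return Regex.Matches(input, pattern)
        .Cast<Match>()
        .Where(m => m.Groups[1].Success)   // skip words matched by \w+ only
        .Select(m => m.Groups[1].Value)
        .ToList();
}

var line = "First: 12345Z Apr 16 Something WORD: ABC Notification #1234 Key1 Key2 Key3 Key4";
Console.WriteLine(string.Join(", ", WantedKeys(line)));   // prints "Key3, Key4"
```

Filtering on Groups[1].Success is what separates the wanted keys from the words that were consumed only to keep the \G chain unbroken.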
Upvotes: 1