Reputation: 31
I'm trying to take a string, check whether it contains a word (in my case the word "Message"), and, if it does, find the word directly after it. My relevant code so far is as follows.
public bool Find(string Word, string Text)
{
    return Text.Contains(Word);
}
The function is then used in various ways, but for this specific purpose it needs to find "Message" as follows:
if (Find("Message", MessageText))
{
    // I don't know what to put here
}
I need to take the string MessageText, find "Message" within it, and then output the first word after "Message". For example, given "Whatever random string Message Brad and more random string", I want to output Brad.
Upvotes: 2
Views: 981
Reputation: 18611
Use
var word = "someword";
var regex = new Regex(string.Format(@"(?<!\w){0}\W+(\w+)", Regex.Escape(word)));
var match = regex.Match(text);
if (match.Success)
{
    Console.WriteLine(match.Groups[1].Value);
}
Regex.Escape(word) is there in case word contains +, [, ( or other special characters. (?<!\w) is better than \b, as it will match correctly even if word starts with a special character. \W+ is better than \s+ because it matches any non-word characters between two words.
See regex proof.
Explanation
--------------------------------------------------------------------------------
(?<! look behind to see if there is not:
--------------------------------------------------------------------------------
\w word characters (a-z, A-Z, 0-9, _)
--------------------------------------------------------------------------------
) end of look-behind
--------------------------------------------------------------------------------
Message 'Message'
--------------------------------------------------------------------------------
\W+ non-word characters (all but a-z, A-Z, 0-9, _) (1 or more times (matching the most amount possible))
--------------------------------------------------------------------------------
( group and capture to \1:
--------------------------------------------------------------------------------
\w+ word characters (a-z, A-Z, 0-9, _) (1 or more times (matching the most amount possible))
--------------------------------------------------------------------------------
) end of \1
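To tie the pattern above back to the question, here is a minimal sketch that wraps it in a helper. The class name NextWordFinder and the method name FindNextWord are made up for illustration, and returning null when there is no match is an assumption about what the caller wants.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NextWordFinder
{
    // Hypothetical helper: returns the word following the first occurrence of
    // `word` in `text`, or null when no such occurrence exists (assumed behavior).
    public static string FindNextWord(string text, string word)
    {
        // Same pattern as in the answer: no word character before `word`,
        // then one or more non-word characters, then the captured next word.
        var pattern = string.Format(@"(?<!\w){0}\W+(\w+)", Regex.Escape(word));
        var match = Regex.Match(text, pattern);
        return match.Success ? match.Groups[1].Value : null;
    }

    public static void Main()
    {
        var text = "Whatever random string Message Brad and more random string";
        Console.WriteLine(FindNextWord(text, "Message"));                          // Brad
        Console.WriteLine(FindNextWord("Message: Brad", "Message"));               // Brad
        Console.WriteLine(FindNextWord("no keyword here", "Message") ?? "(none)"); // (none)
    }
}
```

The second call shows why \W+ is used: the colon and space between the two words are both skipped.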
Upvotes: 1
Reputation: 658
If we can assume your words are broken up by something other than upper case/lower case changes, I would use something like the following:
var regex = new Regex(string.Format(@"\b{0}\b\s+(?<nextWord>\w+)", word));
var match = regex.Match(text);
return match.Groups["nextWord"].Value;
I created a dotnetfiddle to demonstrate: https://dotnetfiddle.net/ypXYpQ
The \b pieces are looking for word boundaries, so I inject the word in between two of those, look for 1 or more whitespace characters with the \s+ piece, then capture the word characters using the \w+ piece. The code also demonstrates how to get the next word out and return it from your function (or whatever you need to do next).
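For completeness, here is a self-contained sketch of the same named-group approach. The names NamedGroupExample and GetNextWord are hypothetical. Escaping the word with Regex.Escape and returning an empty string when nothing matches are my own assumptions and are not part of the snippet above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupExample
{
    // Hypothetical helper: returns the word after `word`, or "" if there is none.
    public static string GetNextWord(string text, string word)
    {
        // \b...\b pins `word` to word boundaries; \s+ requires whitespace
        // before the captured word, which is stored in the group "nextWord".
        var regex = new Regex(string.Format(@"\b{0}\b\s+(?<nextWord>\w+)", Regex.Escape(word)));
        var match = regex.Match(text);
        return match.Success ? match.Groups["nextWord"].Value : string.Empty;
    }

    public static void Main()
    {
        var text = "Whatever random string Message Brad and more random string";
        Console.WriteLine(GetNextWord(text, "Message"));                  // Brad
        Console.WriteLine(GetNextWord("Message: Brad", "Message") == ""); // True
    }
}
```

Because this pattern uses \s+, punctuation directly after the word (as in "Message: Brad") prevents a match. Whether that is desirable depends on your input.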
I first got this expression working by using regex101.com. It is an excellent tool that explains how your regex works, gives you a searchable quick reference, and lets you save and create tests for your expressions.
Upvotes: 0