Reputation: 41
<br />
Your coupon for 50% off MSRP - Inline is: XXXXXXXXXXX<br />
Your coupon for 50% off MSRP - Outdoor is: XXXXXXXXXXX<br /><br />
I want to parse out the coupon code.
I currently have is(.+?)<br>
but the result also includes the <br>
at the end.
Upvotes: 2
Views: 84
Reputation: 9041
Try a lookbehind/lookahead pattern like this:
".*?coupon.*?(?<=: )(\\w+)(?=<br />|<br/>)"
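It captures, into group 1, the run of word characters (letters, digits, underscores) that comes after the word "coupon", sits directly after ": ", and is directly followed by "<br />" or "<br/>". The lookbehind and lookahead only check the surrounding text, so the tag itself is never part of the captured value.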
using System;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main()
    {
        // Sample input from the question.
        string html = "<br />\n" +
            "Your coupon for 50% off MSRP - Inline is: XXXXXXXXXXX<br />" +
            "Your coupon for 50% off MSRP - Outdoor is: XXXXXXXXXXX<br /><br />";
        MatchCollection matches = Regex.Matches(html, ".*?coupon.*?(?<=: )(\\w+)(?=<br />|<br/>)");
        foreach (Match match in matches)
        {
            // Group 1 holds only the coupon code, without the trailing tag.
            Console.WriteLine(match.Groups[1].Value);
        }
    }
}
Results:
XXXXXXXXXXX
XXXXXXXXXXX
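If the trailing tag still shows up in your output, one likely cause is reading the whole match instead of the capture group. Here is a minimal sketch of the difference. The class name, the FindCoupon helper, and the simplified pattern are my own for illustration, and it assumes the input looks like the sample in the question:

```csharp
using System;
using System.Text.RegularExpressions;

public static class CouponDemo
{
    // Hypothetical helper: a simplified pattern, not the one from the answer above.
    // <br\s*/?> accepts "<br>", "<br/>" and "<br />".
    public static Match FindCoupon(string html)
    {
        return Regex.Match(html, @"is: (\w+)<br\s*/?>");
    }

    public static void Main()
    {
        string html = "Your coupon for 50% off MSRP - Inline is: XXXXXXXXXXX<br />";
        Match m = FindCoupon(html);
        if (m.Success)
        {
            Console.WriteLine(m.Value);           // whole match, tag included
            Console.WriteLine(m.Groups[1].Value); // capture group only, no tag
        }
    }
}
```

Match.Value is the entire matched text, so it contains whatever the pattern consumed, including the tag. Groups[1].Value is only what the parentheses captured.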
Upvotes: 1
Reputation: 7407
You should be able to do this without even using Regex. Something like
string s = "Your coupon for 50% off MSRP - Outdoor is: XXXXXXXXXXX";
Console.WriteLine(s.Substring(s.LastIndexOf(' ') + 1));
should work as long as the coupon code is always the last part of the string, with a space prefixing it.
EDIT: After seeing your edit, and that the strings end with <br>
, one alternative is to call .Replace on the result and swap the tag for an empty string:
string s = "Your coupon for 50% off MSRP - Outdoor is: XXXXXXXXXXX<br>";
Console.WriteLine(s.Substring(s.LastIndexOf(' ') + 1).Replace("<br>",""));
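One caveat: the sample in the question uses "<br />" with a space inside the tag, so LastIndexOf(' ') would land inside the tag rather than before the code. A rough sketch that avoids this by splitting on the tag first and then taking what follows the last ": ". The class and ExtractCodes helper are hypothetical names, and it assumes every code is preceded by ": " as in the sample:

```csharp
using System;
using System.Collections.Generic;

public static class CouponSplitDemo
{
    // Hypothetical helper, not part of the original answer.
    public static List<string> ExtractCodes(string html)
    {
        var codes = new List<string>();
        // Split on the tag variants first so the space inside "<br />" is never searched.
        string[] parts = html.Split(
            new[] { "<br />", "<br/>", "<br>" },
            StringSplitOptions.RemoveEmptyEntries);
        foreach (string part in parts)
        {
            // Assumption: the code is whatever follows the last ": " in each piece.
            int idx = part.LastIndexOf(": ", StringComparison.Ordinal);
            if (idx >= 0)
            {
                codes.Add(part.Substring(idx + 2).Trim());
            }
        }
        return codes;
    }

    public static void Main()
    {
        string html = "<br />\n" +
            "Your coupon for 50% off MSRP - Inline is: XXXXXXXXXXX<br />" +
            "Your coupon for 50% off MSRP - Outdoor is: XXXXXXXXXXX<br /><br />";
        foreach (string code in ExtractCodes(html))
        {
            Console.WriteLine(code);
        }
    }
}
```

Pieces without ": " are skipped, so stray empty lines between tags are ignored.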
Upvotes: 0