oe a

Reputation: 2280

Regex woes and parsing a string correctly

I'm trying to parse a string with a regex. This is what I have so far:

 private string result =  @"Range:\s*(?<start>.+\S)\s*to\s*(?<end>.+\S)[\S\s]+For more information, click the link below";

And code to parse:

start = Convert.ToDateTime(matches.Groups["start"].Value);
end = Convert.ToDateTime(matches.Groups["end"].Value);

Here's an example string input:

Range:Jun 8, 2016 to Jun 9, 2016
For more information, click the link below

The start variable is as expected below:

6/8/2016 12:00:00 AM

The end variable is throwing an error on formatting as DateTime. When I output the value of the end regex match, it comes out like this:

9 Jun 2016 For more infor.....

What am I missing in my regex?

Upvotes: 0

Views: 48

Answers (3)

S.Ahl

Reputation: 69

Try this website. The regex it generates is a little long but it's worked for me.

Upvotes: 0

trincot

Reputation: 350345

You would have the result you describe if the text For more information, click the link below does not appear on a separate line.

If no newline follows the date, .+ consumes every character up to the next newline, and that newline can only be matched by the \s in the pattern. This happens because + is greedy. To make it lazy, append a question mark (+?). With lazy quantifiers you don't really need the \S within the capture groups:

Range:\s*(.+?)\s*to\s*(.+?)\s*For more information, click the link below
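
As a rough sketch of how the lazy pattern could be used, the snippet below runs it against the sample input from the question. The named groups (start, end) are carried over from the question's pattern, and the "MMM d, yyyy" format string and the invariant culture are assumptions about the input rather than anything the question confirms:

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

class LazyRangeDemo
{
    static void Main()
    {
        // Sample input from the question, with the footer on its own line.
        string input = "Range:Jun 8, 2016 to Jun 9, 2016\nFor more information, click the link below";

        // Lazy quantifiers (.+?) stop at the first position where the rest of the pattern can match.
        string pattern = @"Range:\s*(?<start>.+?)\s*to\s*(?<end>.+?)\s*For more information, click the link below";

        Match m = Regex.Match(input, pattern);
        if (m.Success)
        {
            // Assumed date format: abbreviated English month, day, comma, four-digit year.
            DateTime start = DateTime.ParseExact(m.Groups["start"].Value, "MMM d, yyyy", CultureInfo.InvariantCulture);
            DateTime end = DateTime.ParseExact(m.Groups["end"].Value, "MMM d, yyyy", CultureInfo.InvariantCulture);

            Console.WriteLine(start.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture)); // 2016-06-08
            Console.WriteLine(end.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture));   // 2016-06-09
        }
    }
}
```

Because \s* also matches a plain space, the same pattern should still capture only the date when the footer text follows on the same line.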

Upvotes: 0

Xiaoy312

Reputation: 14477

Use this pattern:

@"Range:(?<start>\w+ \d+, \d+) to (?<end>\w+ \d+, \d+)"

In case you do need to match the second part as well:

@"Range:(?<start>\w+ \d+, \d+) to (?<end>\w+ \d+, \d+)\r\nFor more information, click the link below";
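
A minimal sketch of how the shorter pattern might be wrapped in a helper follows. RangeParser and TryParseRange are hypothetical names used only for illustration, and the "MMM d, yyyy" format is an assumption about the input. Note that the longer pattern expects a literal \r\n, so it would not match input that uses a bare \n as the line ending:

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

static class RangeParser
{
    // Shorter pattern from this answer: month name, day, comma, year on both sides of " to ".
    private static readonly Regex RangePattern =
        new Regex(@"Range:(?<start>\w+ \d+, \d+) to (?<end>\w+ \d+, \d+)");

    // Hypothetical helper: returns false instead of throwing when the text or the dates do not fit.
    public static bool TryParseRange(string input, out DateTime start, out DateTime end)
    {
        start = default(DateTime);
        end = default(DateTime);

        Match m = RangePattern.Match(input);
        if (!m.Success)
        {
            return false;
        }

        // Assumed date format: abbreviated English month, day, comma, four-digit year.
        return DateTime.TryParseExact(m.Groups["start"].Value, "MMM d, yyyy",
                   CultureInfo.InvariantCulture, DateTimeStyles.None, out start)
            && DateTime.TryParseExact(m.Groups["end"].Value, "MMM d, yyyy",
                   CultureInfo.InvariantCulture, DateTimeStyles.None, out end);
    }

    static void Main()
    {
        // Footer on the same line as the dates, the case that trips up a greedy .+
        string input = "Range:Jun 8, 2016 to Jun 9, 2016 For more information, click the link below";

        if (TryParseRange(input, out DateTime start, out DateTime end))
        {
            Console.WriteLine(start.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture)); // 2016-06-08
            Console.WriteLine(end.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture));   // 2016-06-09
        }
    }
}
```

Since \d+ cannot run past the year, the captured end value stays clean regardless of what follows it.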

Upvotes: 1
