Reputation: 33
I'm trying to make my regex ignore some text that occurs quite rarely. My regex is
var Runners = new Regex(@"(?<=y, |f, |m, )(.*?)(?= runners\))").Matches(set);
The line in question is
Anthony Mildmay, Peter Cazalet Memorial Handicap Chase (Sponsored By Ing Barings) <span class=aside>3m 5f 110y</span></h2><ul class=list><li>(5yo+, 3m 5f 110y, 16 runners)
There is an extra 'y, ' near the beginning of the line (after 'Mildmay'), so the regex picks up too much data. In this example, all I want the regex to find is '16'.
I don't think this can happen often, but it stopped on record 134 of 216424 with this error. Is there a way, perhaps, of only looking 10 spaces behind the word runners for 'y, ' or 'f, ' or 'm, '? Or maybe of looking for a number preceded by 'y, ' or 'f, ' or 'm, '?
Upvotes: 2
Views: 267
Reputation: 1024
Using Lucero's example works for your string. The only thing you will have to remove is the '*' after the \s in the lookbehind (it seems to throw a pattern error).
(?<=[yfm],\s)\d+(?=\s*runners\))
Put your string into regex101 and use that expression; it finds 16.
Upvotes: 0
Reputation: 60190
This may work for you:
(?<=[yfm],\s*)\d+(?=\s*runners\))
Using .* is always "dangerous" (in that it may match something different than anticipated), even when it is not greedy. Try to make your patterns as specific as possible to get correct matches.
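To illustrate the difference, here is a minimal sketch that runs both patterns against the line from the question. The variable names loose and strict are only for illustration, and it assumes the .NET regex engine, which accepts the \s* inside the lookbehind.

```csharp
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // The problematic line from the question.
        string set = "Anthony Mildmay, Peter Cazalet Memorial Handicap Chase (Sponsored By Ing Barings) "
                   + "<span class=aside>3m 5f 110y</span></h2><ul class=list><li>(5yo+, 3m 5f 110y, 16 runners)";

        // Original pattern: the lookbehind is first satisfied right after "Mildmay, ",
        // and the lazy .*? then has to stretch all the way to " runners)".
        string loose = Regex.Match(set, @"(?<=y, |f, |m, )(.*?)(?= runners\))").Value;

        // Specific pattern: only digits are allowed between the lookbehind and the lookahead.
        string strict = Regex.Match(set, @"(?<=[yfm],\s*)\d+(?=\s*runners\))").Value;

        Console.WriteLine(loose.StartsWith("Peter Cazalet")); // True: far too much text
        Console.WriteLine(strict);                            // 16
    }
}
```

Because \d+ cannot match letters, spaces, or markup, the earlier 'y, ' after 'Mildmay' can no longer start a match, even though the lookbehind still succeeds there.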
Upvotes: 1