Ryan

Reputation: 3

Capture multiple lines using Regular Expression (.Net)

I am trying to write a regular expression that captures the text between <p></p> tags. There will be up to 3 consecutive lines of text. I thought this might be possible using the lookahead feature, (?=...).

The pattern I am currently using to capture one line is as follows.

<p>([^']*?)<[/]p

Is it possible to have one regular expression that can get the text between multiple rows of tags? Each line would need to be in its own group.

An example would be

 <p>The</p>
 <p>Grey</p>
 <p>Fox</p>

Upvotes: 0

Views: 1027

Answers (1)

Mark Byers

Reputation: 838216

First, this would be easy with the Html Agility Pack, and you'd get a more robust solution than a regex gives you.
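As a rough sketch of the parser route, assuming the HtmlAgilityPack NuGet package is referenced in the project: the class and helper names (ParagraphReader, GetParagraphTexts) are made up for illustration, and the XPath "//p" simply selects every p element in the fragment.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

static class ParagraphReader
{
    // Hypothetical helper: returns the inner text of every <p> element.
    public static List<string> GetParagraphTexts(string html)
    {
        var texts = new List<string>();

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection paragraphs = doc.DocumentNode.SelectNodes("//p");
        if (paragraphs == null)
            return texts;

        foreach (HtmlNode p in paragraphs)
            texts.Add(p.InnerText);

        return texts;
    }

    static void Main()
    {
        string html = "<p>The</p>\n<p>Grey</p>\n<p>Fox</p>";
        foreach (string text in GetParagraphTexts(html))
            Console.WriteLine(text);
    }
}
```

Unlike a regex, this should keep working if the p tags gain attributes or contain nested markup.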

But you can do it with regex in certain situations if you're 100% in control of the format and the input is coming from a trusted source:

Match match = Regex.Match(html, @"(?:<p>(.*?)</p>\s*)+", RegexOptions.Singleline);
if (match.Success)
{
    foreach (Capture line in match.Groups[1].Captures)
        Console.WriteLine(line.Value);
}

Output:

The
Grey
Fox
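If each line really needs its own group rather than one group with several captures, one possible variation is to spell out up to three named groups and make the second and third optional. This is only a sketch under the same assumptions as above (trusted input, plain p tags with no attributes); the names ParagraphGroups, ExtractUpToThree, line1, line2 and line3 are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class ParagraphGroups
{
    // Hypothetical pattern: one named group per line; lines 2 and 3 are optional.
    const string Pattern =
        @"<p>(?<line1>.*?)</p>\s*" +
        @"(?:<p>(?<line2>.*?)</p>\s*)?" +
        @"(?:<p>(?<line3>.*?)</p>)?";

    public static List<string> ExtractUpToThree(string html)
    {
        var lines = new List<string>();
        Match match = Regex.Match(html, Pattern, RegexOptions.Singleline);
        if (!match.Success)
            return lines;

        foreach (string name in new[] { "line1", "line2", "line3" })
        {
            Group group = match.Groups[name];
            // An optional group that did not take part in the match has Success == false.
            if (group.Success)
                lines.Add(group.Value);
        }
        return lines;
    }

    static void Main()
    {
        string html = " <p>The</p>\n <p>Grey</p>\n <p>Fox</p>";
        foreach (string line in ExtractUpToThree(html))
            Console.WriteLine(line);
    }
}
```

The trade-off is that the pattern grows with the maximum number of lines, whereas the Captures approach above handles any number of consecutive paragraphs with a single group.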

Upvotes: 2
