Niccolo

Reputation: 401

Matching paragraphs with regex

Can somebody explain to me why the following text:

<p>some text some text...</p>
<p>another text another <b>text</b>again</p>

can't be parsed with the following regular expression:

<p>.*?</p>

(to retrieve every paragraph). The regular expression that should match the text between the first opening <p> tag and the last closing </p> tag doesn't work either:

<p>.*</p>

Upvotes: 0

Views: 694

Answers (3)

Bart Kiers

Reputation: 170158

Besides the fact that it's dangerous to parse (X)HTML with regex, try the RegexOptions.Singleline option, which makes . match newline characters as well.
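To illustrate the effect of that option, here is a minimal sketch. It is written in Python rather than .NET, so it uses re.DOTALL, which plays the same role as RegexOptions.Singleline. The sample string is an assumption: it is the text from the question with the first paragraph wrapped onto a second line, since that is the case where the flag matters.

```python
import re

# Assumed sample: the question's text, with the first paragraph
# wrapped across two lines.
html = "<p>some text\nsome text...</p>\n<p>another text another <b>text</b>again</p>"

# Without a flag, "." does not match "\n", so the wrapped paragraph is skipped.
default_matches = re.findall(r"<p>.*?</p>", html)

# With re.DOTALL, "." also matches "\n", so both paragraphs are found.
dotall_matches = re.findall(r"<p>.*?</p>", html, flags=re.DOTALL)

# The greedy form now runs from the first <p> to the last </p>.
greedy_match = re.search(r"<p>.*</p>", html, flags=re.DOTALL)

print(len(default_matches), len(dotall_matches))
```

The lazy pattern returns one match per paragraph, while the greedy one returns a single match spanning everything between the first opening tag and the last closing tag.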

Upvotes: 0

Jeff Yates

Reputation: 62377

You can't parse HTML with RegEx.
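As a rough sketch of what using a parser instead could look like, the snippet below collects paragraph text with Python's standard html.parser module. The class and function names (ParagraphCollector, extract_paragraphs) are hypothetical, the sample input is assumed, and nested or unclosed <p> tags are not handled.

```python
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Hypothetical helper: gathers the text inside each <p> element."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._in_p = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            # Start a fresh buffer for this paragraph.
            self._in_p = True
            self._buf = []

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            # Paragraph closed: store the accumulated text.
            self.paragraphs.append("".join(self._buf))
            self._in_p = False

    def handle_data(self, data):
        # Keep text only while inside a paragraph; inline tags such as
        # <b> are dropped, their text is kept.
        if self._in_p:
            self._buf.append(data)


def extract_paragraphs(html):
    collector = ParagraphCollector()
    collector.feed(html)
    collector.close()
    return collector.paragraphs


sample = "<p>some text\nsome text...</p>\n<p>another text another <b>text</b>again</p>"
print(extract_paragraphs(sample))
```

Line breaks inside a paragraph need no special flag here, because the parser works on tags rather than on lines.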

Upvotes: 1

rerun

Reputation: 25505

My first guess is that you are attempting a multi-line match without telling the regex engine to do so. Take a look at the MSDN documentation on how to pass in the flag.

Upvotes: 1
