Reputation: 12475
I have a string like this.
<p class='link'>try</p>bla bla</p>
I want to get only <p class='link'>try</p>
I have tried this.
/<p class='link'>[^<\/p>]+<\/p>/
But it doesn't work.
How can I do this? Thanks.
Upvotes: 3
Views: 125
Reputation: 560
I tried to write a pattern that is not tied to any particular tag.
(<[^/]+?\s+[^>]*>[^>]*>)
This returns:
<p class='link'>try</p>
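To see what this generic pattern does and does not accept, here is a minimal sketch. It assumes Python's re module (the thread does not say which language is in use), and the second input, <b>x</b>, is a made-up example. The pattern requires whitespace inside the opening tag and never checks that the closing tag matches, so it is best treated as a quick hack.

```python
import re

# Pattern copied from the answer above; the sample string is the one from the question.
TAG_PATTERN = re.compile(r"(<[^/]+?\s+[^>]*>[^>]*>)")
sample = "<p class='link'>try</p>bla bla</p>"

match = TAG_PATTERN.search(sample)
first_element = match.group(1) if match else None
print(first_element)

# Hypothetical input: a tag without attributes has no whitespace for \s+ to match,
# so the pattern finds nothing here.
no_attribute_match = TAG_PATTERN.search("<b>x</b>")
print(no_attribute_match)
```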
Upvotes: 0
Reputation: 4879
It looks like you used the block [^<\/p>]+ intending to match anything up to the sequence </p>. Unfortunately, that's not what it does. A [^...] block matches any single character that is not listed inside it, so [^<\/p>]+ matches a run of characters that are none of <, /, p or >. It stops at the first such character, and the match only succeeds if a literal </p> follows at exactly that point. The pattern therefore fails as soon as the link text itself contains one of those characters, for example a p.
Alex's solution, using a non-greedy quantifier, is how I tend to approach this sort of problem.
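The per-character behaviour of a negated class can be checked directly. This is a small sketch assuming Python's re module, with the slashes left unescaped because Python has no regex delimiters. The second input, containing "help", is a hypothetical string added only to show the failure case.

```python
import re

# The asker's pattern, minus the / delimiters and the escapes they required.
ASKER_PATTERN = re.compile(r"<p class='link'>[^</p>]+</p>")

# "try" contains none of '<', '/', 'p', '>', so the class can consume all of it.
short_text = "<p class='link'>try</p>bla bla</p>"
# "help" contains a 'p', which the negated class refuses to match.
text_with_p = "<p class='link'>help</p>bla bla</p>"

matches_short = ASKER_PATTERN.search(short_text) is not None
matches_with_p = ASKER_PATTERN.search(text_with_p) is not None
print(matches_short, matches_with_p)
```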
Upvotes: 0
Reputation: 490607
If that is your string, and you want the text between those p tags, then this should work:
/<p\sclass='link'>(.*?)<\/p>/
Yours is not working because you put <\/p> inside a negated character class. The class does not match that sequence literally; it excludes each of those characters individually.
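As a rough illustration of the non-greedy pattern, here is a sketch assuming Python's re module (the original uses /.../ delimiters, which Python does not need). The greedy variant is included only for contrast and is not part of the answer.

```python
import re

sample = "<p class='link'>try</p>bla bla</p>"

# Non-greedy: (.*?) stops at the first </p> it can reach.
LINK_PATTERN = re.compile(r"<p\sclass='link'>(.*?)</p>")
match = LINK_PATTERN.search(sample)
whole_element = match.group(0)  # the full <p ...>...</p> element
inner_text = match.group(1)     # only the captured text

# Greedy, for comparison: (.*) runs on to the last </p> in the string.
greedy_inner = re.search(r"<p\sclass='link'>(.*)</p>", sample).group(1)

print(whole_element)
print(inner_text)
print(greedy_inner)
```

Group 0 gives the whole element the question asks for, while group 1 gives just the text between the tags.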
Of course, I am obliged to mention that there are better tools for parsing HTML fragments, such as an HTML parser.
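For comparison, a parser-based approach might look like the sketch below. It assumes Python and its standard html.parser module, which the thread does not mention. The class name LinkParagraphParser is made up for this example.

```python
from html.parser import HTMLParser


class LinkParagraphParser(HTMLParser):
    """Collects the text of every <p> element whose class attribute is exactly 'link'."""

    def __init__(self):
        super().__init__()
        self.texts = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with the quotes already removed.
        if tag == "p" and ("class", "link") in attrs:
            self._inside = True
            self.texts.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self._inside = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so append to the current entry.
        if self._inside:
            self.texts[-1] += data


parser = LinkParagraphParser()
parser.feed("<p class='link'>try</p>bla bla</p>")
parser.close()
link_texts = parser.texts
print(link_texts)
```

The stray closing </p> at the end of the sample is simply ignored here, which is the kind of malformed input that tends to trip up a regex.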
Upvotes: 4