Reputation: 3912
This is the sort of HTML string I will be performing matches on:
<span class="q1">+12 Spell Power and +10 Hit Rating</span>
I want to get +12 Spell Power and +10 Hit Rating out of the above HTML. This is the code I wrote:
preg_match('/<span class="q1">(.*)<\/span>/', $gem, $match);
But because of the <\/span> in the pattern, it seems to be escaping the / in </span>, so the match doesn't stop there and I get a lot more data than I want.
How can I escape the / in </span> while still keeping it as part of the pattern?
Thanks.
Upvotes: 3
Views: 3905
Reputation: 89102
Don't use regex to parse HTML. Use an HTML parser. See Robust, Mature HTML Parser for PHP.
Upvotes: 2
Reputation: 7728
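The \/ escape is fine; it is needed because / is the pattern delimiter. I think your regex is capturing more than you want because * is greedy: it matches as much as possible, so (.*) runs on to the last </span> in the string. Use *? instead, which matches as little as possible: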
preg_match('/<span class="q1">(.*?)<\/span>/', $gem, $match);
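To make the difference concrete, here is a minimal sketch comparing the two quantifiers. The `$gem` value is a hypothetical input with two spans (the question only shows one), chosen so the greedy and lazy results differ. The last call uses `#` as the delimiter instead of `/`, which is one way to avoid escaping the slash at all.

```php
<?php
// Hypothetical input: two q1 spans in one string, so greedy vs. lazy differ.
$gem = '<span class="q1">+12 Spell Power and +10 Hit Rating</span> '
     . '<span class="q1">+5 Stamina</span>';

// Greedy: (.*) extends to the LAST </span> in the string.
preg_match('/<span class="q1">(.*)<\/span>/', $gem, $greedy);

// Lazy: (.*?) stops at the FIRST </span>.
preg_match('/<span class="q1">(.*?)<\/span>/', $gem, $lazy);

// Same lazy pattern with # as the delimiter, so / needs no escaping.
preg_match('#<span class="q1">(.*?)</span>#', $gem, $hash);

echo $greedy[1], "\n"; // includes the first </span> and the second opening tag
echo $lazy[1], "\n";   // +12 Spell Power and +10 Hit Rating
echo $hash[1], "\n";   // same as the lazy result
```

Note that `.` does not match newlines by default, so if the span content can wrap across lines the `s` modifier would be needed as well.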
Upvotes: 3
Reputation: 186552
loadHTML
method and getElementsByTagName('span')
-
$doc = new DOMDocument();
$doc->loadHTML($htmlString);
$spans = $doc->getElementsByTagName('span');
if ( $spans->length > 0 ) {
    foreach ( $spans as $span ) {
        echo $span->textContent, "\n"; // text inside each <span>
    }
}
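Since the question only wants spans with class q1, a possible refinement is to filter with DOMXPath rather than looping over every span. This is a sketch, not part of the original answer: the `$htmlString` value is a hypothetical input, and the XPath test assumes the class attribute is exactly `q1` (it would not match `class="q1 other"`).

```php
<?php
// Hypothetical input for illustration; the q2 span should be skipped.
$htmlString = '<div><span class="q1">+12 Spell Power and +10 Hit Rating</span>'
            . '<span class="q2">ignored</span></div>';

// Collect libxml warnings internally instead of printing them;
// real-world HTML is often not perfectly well-formed.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($htmlString);
libxml_clear_errors();

// Select only <span> elements whose class attribute is exactly "q1".
$xpath = new DOMXPath($doc);
$texts = [];
foreach ($xpath->query('//span[@class="q1"]') as $span) {
    $texts[] = $span->textContent;
}

print_r($texts);
```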
Upvotes: 2