Reputation: 7
<table class="trailer">
------------------Begin---------------------
<tbody><tr>
<td class="newtrailer-text">
Trailer 2<br>
</td></tr>
<br>
<b>(Yahoo)</b><br>
<b>(High Definition)</b><br>
<a href="http://playlist.yahoo.com/makeplaylist.dll?sid=107193280&sdm=web&pt=rd">(1080p)</a><br>
<a href="http://playlist.yahoo.com/makeplaylist.dll?sid=107193279&sdm=web&pt=rd">(720p)</a><br>
<a href="http://playlist.yahoo.com/makeplaylist.dll?sid=107193272&sdm=web&pt=rd">(480p)</a><br>
<br>
<b>(Warner Bros.)</b><br>
<b>(High Definition)</b><br>
<a href="http://pdl.warnerbros.com/wbmovies/inception/trl_3/Inception_TRLR3_1080.mov">(1080p)</a><br>
<a href="http://pdl.warnerbros.com/wbmovies/inception/trl_3/Inception_TRLR3_720.mov">(720p)</a><br>
<a href="http://pdl.warnerbros.com/wbmovies/inception/trl_3/Inception_TRLR3_480.mov">(480p)</a>=
--------------END----------------
</tbody></table>
How would I get all the data between begin and end? I've tried the following with no results. Any help would be appreciated. Thanks.
$regex = '#<td class="newtrailer-text">([^"]+)</tbody></table>#si';
Upvotes: 0
Views: 149
Reputation:
Here's the canonical link for why you should use DOM to parse (X)HTML: The pony, he comes.
But here's the deal with your regex: ([^"]+) matches only characters that are not double quotes, so it stops at the first occurrence of a double quote ("). Your regex therefore requires that no double quote appear anywhere between <td class="newtrailer-text"> and </tbody></table>. The href attributes in between contain quotes, so no match is found.
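To see that behaviour in isolation, here is a minimal sketch. The input strings are shortened, hypothetical stand-ins for the markup in the question (the example.com URL is made up); only the pattern is taken from the question as written.

```php
<?php
// Shortened, hypothetical stand-in for the markup in the question.
$str = '<td class="newtrailer-text">Trailer 2<br>'
     . '<a href="http://example.com/t.mov">(1080p)</a>'
     . '</tbody></table>';

// Pattern from the question: [^"]+ cannot cross the quotes around the href value.
$original = '#<td class="newtrailer-text">([^"]+)</tbody></table>#si';
$matched = preg_match($original, $str, $m);
var_dump($matched); // int(0): a double quote sits between the two literals

// With no double quote in between, the very same pattern matches.
$noQuotes = '<td class="newtrailer-text">Trailer 2<br></tbody></table>';
$matchedNoQuotes = preg_match($original, $noQuotes, $m2);
var_dump($matchedNoQuotes); // int(1)
```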
Instead, try:
$regex = '#<td class="newtrailer-text">(.+)</tbody></table>#siU';
if (preg_match($regex, $str, $m)) {
    echo $m[1];
} else {
    echo 'No match';
}
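As for the DOM route mentioned at the top of this answer, a rough sketch could look like the following. It assumes a simplified, well-formed version of the table (the markup in the question is not well-formed, so results on the real page may differ), and the example.com URLs are placeholders, not the real trailer links.

```php
<?php
// Hypothetical, well-formed stand-in for the table in the question.
$html = '<table class="trailer"><tbody><tr><td class="newtrailer-text">'
      . 'Trailer 2<br>'
      . '<a href="http://example.com/trailer_1080.mov">(1080p)</a><br>'
      . '<a href="http://example.com/trailer_720.mov">(720p)</a>'
      . '</td></tr></tbody></table>';

// Collect parser warnings internally instead of printing them;
// real-world markup tends to trigger plenty.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Select every link inside the table with class "trailer".
$xpath = new DOMXPath($doc);
$links = [];
foreach ($xpath->query('//table[@class="trailer"]//a') as $a) {
    // Map the link label, e.g. "(1080p)", to its href.
    $links[trim($a->textContent)] = $a->getAttribute('href');
}

print_r($links);
```

This pulls out the links themselves rather than a raw chunk of HTML, which is usually what you want in the end anyway.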
Upvotes: 2
Reputation: 784938
You can use a non-greedy regex like this:
if (preg_match_all('#------------------Begin---------------------(.*?)--------------END----------------#s', $str, $m)) {
    print_r($m[1]);
}
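Why non-greedy matters shows up as soon as the input holds more than one marked block. A small sketch, using shortened hypothetical markers (--Begin-- and --END--) in place of the long dashed lines from the question:

```php
<?php
// Hypothetical input with two marked blocks; the short markers are stand-ins.
$str = "--Begin--first--END--\n--Begin--second--END--";

// Lazy quantifier: each match stops at the nearest END marker.
preg_match_all('#--Begin--(.*?)--END--#s', $str, $lazy);

// Greedy quantifier: one match running to the last END marker.
preg_match_all('#--Begin--(.*)--END--#s', $str, $greedy);

print_r($lazy[1]);   // two entries: "first" and "second"
print_r($greedy[1]); // one entry spanning both blocks
```

With a single Begin/END pair, as in the question, both variants capture the same text.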
Upvotes: 1