Lim Neo

Reputation: 75

preg_match issue to retrieve data from table

 <h3 style="border-bottom: 3px solid #CCC;" class="margint15 marginb15">Headlines</h3> 
 <table cellpadding="0" cellspacing="0" border="0" class="nc" width="100%"> 
     <tr>
        <th class="left" colspan="2">Latest Headlines</th>
     </tr>
     <tr>
        <td class="left" width="620"> <a href="/blogs/rhb/79680.jsp" style="color:#06a;">Trading Stocks - 10
     July 2015 - Globetronics | A&M | Salcon | Comintel | Homeritz |
     MMSV</a> </td>
    </tr>
</table>

I want to extract the data from the opening <table> tag that has class="nc" up to its closing </table> tag. How do I write the pattern for preg_match?

Upvotes: 1

Views: 67

Answers (2)

Vinod Patidar

Reputation: 685

You should go with this:

$str = '<h3 style="border-bottom: 3px solid #CCC;" class="margint15 marginb15">Headlines</h3><table cellpadding="0" cellspacing="0" border="0" class="nc" width="100%"> <tr><th class="left" colspan="2">Latest Headlines</th></tr> <tr><td class="left" width="620"> <a href="/blogs/rhb/79680.jsp" style="color:#06a;">Trading Stocks - 10 July 2015 - Globetronics | A&M | Salcon | Comintel | Homeritz | MMSV</a> </td></tr></table>';
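// The s flag lets . match newlines; the lazy (.*?) stops at the first </table>.
preg_match_all('/<table.*?>(.*?)<\/table>/si', $str, $matches);

echo "<pre>";
// $matches[1][0] is the inner HTML of the first table; strip_tags() leaves only the text.
print_r( strip_tags($matches[1][0]) );
die();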
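
The pattern above matches any table, while the question asks specifically for the one with class="nc". A minimal sketch of a narrower pattern follows. It assumes the class attribute is written exactly as class="nc" (double quotes, a single class name) and that the table contains no nested tables. The shortened $html string is made up for illustration.

```php
<?php
// Hypothetical sample input: the question's markup, shortened.
$html = '<h3 class="margint15 marginb15">Headlines</h3>'
      . '<table cellpadding="0" class="nc" width="100%">'
      . '<tr><th class="left" colspan="2">Latest Headlines</th></tr>'
      . '<tr><td class="left"><a href="/blogs/rhb/79680.jsp">Trading Stocks - 10 July 2015</a></td></tr>'
      . '</table>';

// [^>]* keeps the attribute match inside the opening <table ...> tag;
// the lazy (.*?) with the s flag stops at the first </table>.
$pattern = '/<table\b[^>]*\bclass="nc"[^>]*>(.*?)<\/table>/si';

if (preg_match($pattern, $html, $m)) {
    $inner = $m[1];                     // everything between <table ...> and </table>
    $text  = trim(strip_tags($inner));  // text only, tags removed
    echo $text, PHP_EOL;
} else {
    echo 'No table with class="nc" found', PHP_EOL;
}
```

If the markup on the real page differs (single quotes, several class names, nested tables), this pattern will probably need adjusting, which is one reason a parser tends to be the more robust choice.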

Thanks!

Upvotes: 0

Jan

Reputation: 43169

Really, this has been discussed here a thousand times: it is better not to use regular expressions to grab HTML tags (though there are cases in which it works quite well). In the Christmas spirit, here's an example for your purpose (scraping financial data from a site that is not yours ;-)). Consider using an XML parser instead:

<?php
$str='<container>
<h3 style="border-bottom: 3px solid #CCC;" class="margint15
marginb15">Headlines</h3>  <table cellpadding="0" cellspacing="0"
border="0" class="nc" width="100%"> <tr><th class="left"
colspan="2">Latest Headlines</th></tr> <tr><td class="left" width="620"> <a
href="/blogs/rhb/79680.jsp" style="color:#06a;">Trading Stocks - 10
July 2015 - Globetronics | A&amp;M | Salcon | Comintel | Homeritz |
MMSV</a> </td></tr></table>
</container>';
$xml = simplexml_load_string($str);
print_r($xml);

// now you can loop over the table rows with
foreach ($xml->table->tr as $row) {
    // do whatever you want with it
    // child elements can be accessed likewise
}
?>

Hint: Obviously, I made up the container tag; in your case the root element is likely to be html.
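
Since a real page is HTML rather than well-formed XML, one possible variant is PHP's DOMDocument::loadHTML(), which is generally more forgiving of things like an unescaped & and needs no made-up wrapper element. The following is only a sketch under that assumption. The $html fragment and the $headlines array are illustrative names, not part of the original page or answer.

```php
<?php
// Hypothetical input: a fragment of the page from the question.
$html = '<h3 class="margint15 marginb15">Headlines</h3>'
      . '<table class="nc" width="100%">'
      . '<tr><th class="left" colspan="2">Latest Headlines</th></tr>'
      . '<tr><td class="left"><a href="/blogs/rhb/79680.jsp">Trading Stocks - 10 July 2015 - Globetronics | A&M | Salcon</a></td></tr>'
      . '</table>';

// Collect parser warnings internally instead of printing them.
$previous = libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$headlines = [];
foreach ($doc->getElementsByTagName('table') as $table) {
    if ($table->getAttribute('class') !== 'nc') {
        continue; // skip tables other than class="nc"
    }
    foreach ($table->getElementsByTagName('a') as $link) {
        $headlines[] = [
            'href' => $link->getAttribute('href'),
            'text' => trim($link->textContent),
        ];
    }
}
print_r($headlines);
```

The class check here is an exact string comparison, so a table with several class names would be skipped; that case would need a looser test.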

Appendix: As Scuzzy points out, familiarize yourself with XPath (here's a good starting point); the combination is extremely powerful.
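
To give an idea of what that combination can look like, here is a small sketch using SimpleXMLElement::xpath() on a shortened version of the string above. The XPath expression is just one way to pick the links out of the class="nc" table, and the shortened $str is made up for illustration.

```php
<?php
// Hypothetical, well-formed sample based on the <container> string above.
$str = '<container>'
     . '<table class="nc" width="100%">'
     . '<tr><th class="left" colspan="2">Latest Headlines</th></tr>'
     . '<tr><td class="left"><a href="/blogs/rhb/79680.jsp">Trading Stocks - 10 July 2015 - Globetronics | A&amp;M</a></td></tr>'
     . '</table>'
     . '</container>';

$xml = simplexml_load_string($str);

// Select every <a> inside a <td> of the table whose class attribute is exactly "nc".
$links = $xml->xpath('//table[@class="nc"]//td/a');

foreach ($links as $a) {
    // Cast SimpleXMLElement values to string to get plain text.
    echo (string) $a['href'], ' => ', trim((string) $a), PHP_EOL;
}
```

The same expression should also work with DOMXPath if the document was loaded through DOMDocument instead.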

Upvotes: 1
