Reputation: 1786
I want to extract some data from a table using PHP's preg_match_all(). Given the HTML below, I want to get the values in the td cells, for example RC063154016 for "Product code:". How can I do that? I don't have any experience with regex.
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td><span>Product code:</span> RC063154016</td>
<td><span>Gender:</span> Female</td>
</tr>
</tbody>
</table>
Upvotes: 0
Views: 1193
Reputation: 153
Use an HTML parser to parse the markup and query the result; don't use the preg_* functions here. Please read the answers to "How do you parse and process HTML/XML in PHP?".
Upvotes: 0
Reputation: 3813
$data = <<<HTML
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td><span>Product code:</span> RC063154016</td>
<td><span>Gender:</span> Female</td>
</tr>
</tbody>
</table>
HTML;
// $matches[1] holds the captured product code(s), e.g. "RC063154016"
if (preg_match_all('#<td>\s*<span>Product code:</span>\s*([^<]*)</td>#i', $data, $matches)) {
    print_r($matches);
}
Upvotes: 0
Reputation: 39355
This should work for you (the captured value keeps the space after </span>, so trim() it):
preg_match_all('|<td><span>Product code:</span>([^<]*)</td>|', $html, $match);
If there may be arbitrary whitespace around the tags, use this one instead:
preg_match_all('|<td>\s*<span>\s*Product code:\s*</span>([^<]*)</td>|', $html, $match);
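If every cell follows the same `<span>Label:</span> value` layout as in the question, the same idea can be generalized to collect all label/value pairs at once. This is only a sketch under that assumption; the `$fields` array and the exact pattern are illustrative, and a regex like this will break on markup that deviates from the sample.

```php
<?php
// Sample input copied from the question.
$html = <<<HTML
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td><span>Product code:</span> RC063154016</td>
<td><span>Gender:</span> Female</td>
</tr>
</tbody>
</table>
HTML;

// Group 1: the label inside <span> (without the colon).
// Group 2: the text between </span> and </td>, with surrounding whitespace excluded.
$pattern = '#<td>\s*<span>\s*([^<:]+):\s*</span>\s*([^<]*?)\s*</td>#i';

$fields = []; // hypothetical result array: label => value
if (preg_match_all($pattern, $html, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        $fields[trim($m[1])] = $m[2];
    }
}

print_r($fields);
```

With PREG_SET_ORDER each entry of `$matches` is one cell, so the loop can read the label and value of the same cell side by side.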
Upvotes: 0
Reputation: 1899
Use DOMDocument:
$str = <<<STR
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td><span>Product code:</span> RC063154016</td>
<td><span>Gender:</span> Female</td>
</tr>
</tbody>
</table>
STR;
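$dom = new DOMDocument();
// Collect libxml parse warnings internally instead of silencing them with @
libxml_use_internal_errors(true);
$dom->loadHTML($str);
libxml_clear_errors();
$tds = $dom->getElementsByTagName('td');
foreach ($tds as $td) {
    // nodeValue is the cell's full text, e.g. "Product code: RC063154016"
    echo $td->nodeValue . '<br>';
}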
Output (as rendered in a browser):
Product code: RC063154016
Gender: Female
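To get only the value for one label rather than the whole cell text, an XPath query on the same document is one option. The following is a minimal sketch that assumes the markup from the question; the variable name `$productCode` and the exact XPath expression are illustrative choices, not the only way to do it.

```php
<?php
// Sample input copied from the question.
$html = <<<HTML
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td><span>Product code:</span> RC063154016</td>
<td><span>Gender:</span> Female</td>
</tr>
</tbody>
</table>
HTML;

libxml_use_internal_errors(true); // keep parse warnings out of the output
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Find the <td> whose <span> reads "Product code:", then take the first
// text node that follows that <span> inside the cell.
$query = '//td[span[normalize-space(.)="Product code:"]]/span/following-sibling::text()[1]';
$node = $xpath->query($query)->item(0);

// item(0) returns null when nothing matched, so guard before reading it.
$productCode = $node !== null ? trim($node->nodeValue) : null;

echo $productCode, PHP_EOL;
```

Changing the label string in the query (for example to "Gender:") selects the value from the other cell in the same way.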
Upvotes: 3