mpet

Reputation: 1014

Extract specific values from HTML table using regex

I have an HTML file that contains this table row:

<tr> 
<td class="color21 right" style="font-size:12px; line-height:1.2;">&nbsp;Location</td>
<td class="color21" style="font-size:12px;">10</td>
<td class="color21" style="font-size:12px;"><img src="../../icons/9.gif" alt="Type" />     </td>
<td class="color21" style="font-size:12px;">3</td>
<td class="color21" style="font-size:12px;">7</td>
<td class="color21" style="font-size:12px;"><img src="../../icons/11.gif" alt="Type" />    </td>
<td class="color21" style="font-size:12px;">3</td>
<td class="color21" style="font-size:12px;">10</td>
<td class="color21" style="font-size:12px;"><img src="../../icons/9.gif" alt="Type" />    </td>
</tr>

I'm retrieving the file contents using file_get_contents.

How can I extract all the TD values using preg_match or preg_match_all?

Upvotes: 1

Views: 742

Answers (2)

Nambi

Reputation: 12042

Use a DOM parser such as DOMDocument to parse the HTML content; regular expressions are not reliable in cases like this.

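    $str = file_get_contents('read.txt');
    $dom = new DOMDocument;
    $dom->loadHTML($str);
    $cells = $dom->getElementsByTagName('td');
    foreach ($cells as $td) {
        // Strip surrounding whitespace, including the non-breaking space from &nbsp;
        $text = preg_replace('/^[\s\x{A0}]+|[\s\x{A0}]+$/u', '', $td->nodeValue);
        if ($text !== '') {
            echo $text . "\n";
        } else {
            // The cell has no text, so print the src and alt of any images in it
            $images = $td->getElementsByTagName('img');
            foreach ($images as $image) {
                echo $image->getAttribute('src') . " ";
                echo $image->getAttribute('alt') . "\n";
            }
        }
    }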
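
If the values are needed in an array rather than echoed, the same DOM approach can be wrapped in a function. This is only a sketch: `extractCells` is a hypothetical helper name, the inline HTML is a shortened copy of the row from the question wrapped in a `<table>` element, and the `libxml_use_internal_errors` calls are an optional addition to keep parser warnings quiet on untidy markup.

```php
<?php
// Hypothetical helper: returns one string per <td>, falling back to the
// image's src and alt when the cell has no text of its own.
function extractCells(string $html): array
{
    // Optional: collect libxml warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $cells = [];
    foreach ($dom->getElementsByTagName('td') as $td) {
        // Trim ordinary whitespace plus the non-breaking space produced by &nbsp;
        $text = preg_replace('/^[\s\x{A0}]+|[\s\x{A0}]+$/u', '', $td->textContent);
        if ($text !== '') {
            $cells[] = $text;
            continue;
        }
        // No text in this cell: record each image as "src alt".
        foreach ($td->getElementsByTagName('img') as $img) {
            $cells[] = $img->getAttribute('src') . ' ' . $img->getAttribute('alt');
        }
    }
    return $cells;
}

// Shortened sample based on the row in the question (assumed input).
$html = '<table><tr>'
    . '<td class="color21 right">&nbsp;Location</td>'
    . '<td class="color21">10</td>'
    . '<td class="color21"><img src="../../icons/9.gif" alt="Type" />     </td>'
    . '</tr></table>';

print_r(extractCells($html));
```

For the real file, the `$html` string would come from file_get_contents, as in the question.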

Upvotes: 1

Amit Joki

Reputation: 59232

Think carefully about whether you really want to use a regex to parse HTML.

But you can use this:

<td.+?>(.+?)</td>

The first capture group will contain the contents of each <td>.
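
As a rough sketch of how this pattern might be used with preg_match_all (the function the question asks about), assuming each cell sits on a single line as in the question's markup. The `#` delimiter and the inline sample HTML are choices made for this illustration only.

```php
<?php
// Shortened sample based on the row in the question (assumed input).
$html = <<<HTML
<tr>
<td class="color21 right" style="font-size:12px; line-height:1.2;">&nbsp;Location</td>
<td class="color21" style="font-size:12px;">10</td>
<td class="color21" style="font-size:12px;"><img src="../../icons/9.gif" alt="Type" />     </td>
</tr>
HTML;

// '#' is used as the delimiter so the '/' in '</td>' needs no escaping.
preg_match_all('#<td.+?>(.+?)</td>#', $html, $matches);

// $matches[1] holds the first capture group of every match.
$values = array_map('trim', $matches[1]);
print_r($values);
```

Note that the captured text is raw markup: the `&nbsp;` entity and the `<img>` tag come back as written in the source, so any decoding or attribute extraction would be a further step. The pattern also expects every `<td>` to carry at least one attribute and to have non-empty contents, which holds for the row in the question.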

Upvotes: 1
