Not a machine

Reputation: 551

CSS selection using Mojo::DOM

This is a multidisciplinary question so the answer may not be purely CSS.

I am parsing a large table and my goal is to retrieve only the text outside of the <b></b> tags. I am able to select the rows, but I am stuck on how to select only the text outside of the bold tag.

HTML

<div id="tabs-1">
<table width='650' class='subtblfont'>
    <tr><td>&nbsp;</td></tr> 
    <tr><td>&nbsp;</td></tr>        
    <tr>
        <td><b>Check-in Date:&nbsp;</b>04/20/2013</td>
        <td><b>Check-in Date:&nbsp;</b>04/25/2013</td>
    </tr>
</table>
</div>

Code

$row_content = $results_dom->find('div#tabs-1 tr:nth-child(3) td');

foreach (@$row_content) {
    print "$_\n";
}

Output

<td><b>Check-in Date:&nbsp;</b>04/20/2013</td>
<td><b>Check-in Date:&nbsp;</b>04/25/2013</td>

Desired Output

04/20/2013
04/25/2013

I am able to use regular expressions to pull out the text but that is not an ideal solution at this point. Is there a way to select only the non-bold text?

Upvotes: 0

Views: 255

Answers (1)

Pat

Reputation: 2750

From the Mojo::DOM documentation:

text

Extract text content from this element only (not including child elements).

Try giving this a shot:

(Granted, I don't really know Perl, so if I got the syntax wrong... sorry)

use feature 'say';  # enables say (use Mojo::Base -strict also works)
$row_content = $results_dom->find('div#tabs-1 tr:nth-child(3) td')->each(sub { say $_->text });
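For anyone who wants to try this end to end, here is a minimal sketch built on the markup from the question. It assumes the wrapping div's id is really "tabs-1" (to match the selector), that the snippet is closed with </div>, and that Mojolicious is installed. The names $html and @dates are just illustrative.

```perl
use strict;
use warnings;
use feature 'say';
use Mojo::DOM;

# Sample markup copied from the question. Assumption: the id is "tabs-1"
# so that it matches the selector used in the question.
my $html = <<'HTML';
<div id="tabs-1">
<table width='650' class='subtblfont'>
    <tr><td>&nbsp;</td></tr>
    <tr><td>&nbsp;</td></tr>
    <tr>
        <td><b>Check-in Date:&nbsp;</b>04/20/2013</td>
        <td><b>Check-in Date:&nbsp;</b>04/25/2013</td>
    </tr>
</table>
</div>
HTML

my $results_dom = Mojo::DOM->new($html);

# text only looks at the text nodes directly inside each <td>,
# so the label inside the nested <b> element is left out.
my @dates = $results_dom->find('div#tabs-1 tr:nth-child(3) td')
                        ->map(sub { $_->text })
                        ->each;

say for @dates;
```

Collecting the values with map and each, rather than printing inside the callback, keeps the dates in a plain Perl array for later use. With the markup above this should give the two dates listed under "Desired Output".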

Upvotes: 2
