BenB

Reputation: 2907

PHP Simple Dom - get immediate element after element similar to + CSS selector

I want to extract data from HTML with this structure:

<html>
  <body>
     <table>
        <tr>
            <td>
                <table>
                    <tr>
                        <td>
                            <table>
                                <tr>
                                    <td>
                                        <table>
                                            <tr>
                                                <td>TD1
                                                    <table>
                                                        <tr>
                                                            <td>TD2
                                                                <table>
                                                                    <tr>
                                                                        <td>TD3</td>
                                                                    </tr>
                                                                </table>
                                                            </td>
                                                        </tr>
                                                    </table>
                                                </td>
                                            </tr>
                                        </table>
                                    </td>
                                </tr>
                            </table>
                        </td>
                    </tr>
                </table>
            </td>
        </tr>
    </table>
  </body>
</html>

I would like to get this text result once:

TD1 TD2 TD3

When I try this with PHP Simple HTML DOM:

foreach($html->find('body + table + table + table + table') as $element) 
   echo $element->innertext . '<br>';

I get this result:

TD1 TD2 TD3

TD2 TD3

TD3

It seems the PHP Simple HTML DOM library doesn't work with the + CSS selector, so it finds the element "body + table + table + table + table" several times rather than only the immediate one, body > table > table > table > table.

How can I get only the outermost match once, so that the result is TD1 TD2 TD3? This structure occurs multiple times in the same page, so I'm looking for something similar to the + CSS selector that returns every occurrence of body + table + table + table + table in the page.

Upvotes: 0

Views: 304

Answers (1)

Francis Eytan Dortort

Reputation: 1447

You could try Symfony's DomCrawler component. Its filter() method accepts CSS selectors (with a few minor exceptions, see here).
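
If pulling in an extra component is not an option, here is a minimal sketch of another route. It uses PHP's built-in DOMDocument and DOMXPath, not DomCrawler or Simple HTML DOM. Instead of chaining selectors, it picks every table that has exactly three table ancestors, which in the structure from the question is the table holding TD1. The helper name textOfTablesAtDepth and the depth value 3 are assumptions made for illustration; adjust the depth to the real page.

```php
<?php
// Hypothetical helper: returns the whitespace-normalized text of every
// <table> that has exactly $depth <table> ancestors.
function textOfTablesAtDepth(string $html, int $depth): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $results = [];
    // Counting ancestors avoids matching the deeper nested tables again.
    foreach ($xpath->query("//table[count(ancestor::table) = $depth]") as $table) {
        $results[] = trim(preg_replace('/\s+/', ' ', $table->textContent));
    }
    return $results;
}

// Compact version of the structure in the question (six nested tables).
$html = '<html><body><table><tr><td><table><tr><td><table><tr><td>'
      . '<table><tr><td>TD1 <table><tr><td>TD2 <table><tr><td>TD3</td></tr></table>'
      . '</td></tr></table></td></tr></table>'
      . '</td></tr></table></td></tr></table></td></tr></table></body></html>';

// Assumption: the wanted table is the fourth one, i.e. it has 3 table ancestors.
$texts = textOfTablesAtDepth($html, 3);
echo implode("\n", $texts), "\n";
```

Because the query matches on nesting depth rather than on a fixed path, it returns one entry per occurrence of the structure in the page and does not depend on whether the parser adds tbody elements.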

Upvotes: 0
