forthrin

Reputation: 2777

Finding all divs with a certain class in an HTML document

Using PHP, how can I extract all <div class="this"> elements, even though they appear on different hierarchical levels in an HTML document?

<h3>Hello</h3>
<p>World</p>
<div class="this">
    (lots of random markup, including other divs)
</div>
<div class="this">
    (more random markup, including other divs)
</div>
<div class="inside">
    <div class="this">
        (even more random markup, including other divs)
    </div>
</div>
<p>Bye.</p>

If it's not possible to achieve with regular expressions, does PHP have a built-in library that makes it easy to do something like this (pseudo-code)?

$result = find_all($html, "div", "this");

Desired result:

$result = array(
'<div class="this">
    (lots of random markup, including other divs)
</div>',
'<div class="this">
    (more random markup, including other divs)
</div>',
'<div class="this">
    (even more random markup, including other divs)
</div>',
);

Upvotes: 0

Views: 1038

Answers (3)

MOB

Reputation: 853

You can use PHP Simple HTML DOM Parser for this. The code would look something like this:

<?php
// Requires the third-party PHP Simple HTML DOM Parser library.
include_once "simple_html_dom.php";

$html = str_get_html('<h3>Hello</h3><p>World</p><div class="this"> (lots of random markup, including other divs)</div><div class="this"> (more random markup, including other divs)</div><div class="inside"> <div class="this"> (even more random markup, including other divs) </div></div><p>Bye.</p>');

// Select every div with class "this", at any nesting level.
$divs = $html->find('div.this');

// Collect the full markup (tag included) of each match.
$ans = array();
foreach ($divs as $div) {
    $ans[] = $div->outertext;
}

print_r($ans);
?>

Upvotes: 1

SparK

Reputation: 5211

You need to load your document with DOMDocument, using the method loadHTMLFile (for a file) or loadHTML (for a string).
Once you have the document in a variable, you can call $instance->getElementsByTagName("div"), which gives you a DOMNodeList. Then loop over it with foreach and filter the DOMNodes using getAttribute("class").
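
A minimal sketch of that approach, in case it helps. The helper name find_divs_by_class() is made up for illustration (it is not a PHP built-in), and the sample markup is shortened. It assumes the class attribute may contain several space-separated names, and it uses saveHTML($node) to get each div's outer markup:

```php
<?php
// Hypothetical helper (not part of PHP): returns the outer HTML of every
// <div> whose class attribute contains $class, at any nesting level.
function find_divs_by_class(string $html, string $class): array
{
    $doc = new DOMDocument();

    // Collect libxml parse warnings internally instead of emitting them,
    // since real-world markup is rarely perfectly valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $result = [];
    foreach ($doc->getElementsByTagName('div') as $div) {
        // Split the class attribute on whitespace so "a this b" also matches.
        $classes = preg_split('/\s+/', $div->getAttribute('class'), -1, PREG_SPLIT_NO_EMPTY);
        if (in_array($class, $classes, true)) {
            // Passing a node to saveHTML() serializes that node and its children.
            $result[] = $doc->saveHTML($div);
        }
    }
    return $result;
}

$html = '<h3>Hello</h3><div class="this">one</div>'
      . '<div class="inside"><div class="this">two</div></div><p>Bye.</p>';

print_r(find_divs_by_class($html, 'this'));
```

Note that if a matching div contains another matching div, both are returned, so the inner one shows up twice (once on its own and once inside its parent).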

Upvotes: 0

bashleigh

Reputation: 9324

PHP is mainly an HTML preprocessor. So to do what you're asking, you'd have to get the document using file_get_contents(), or use some AJAX to send the data to your PHP script. The latter seems a bit extreme for what you've asked.

Depending on what you're trying to achieve, I'd personally recommend saving these divs somewhere else, such as a database, before processing them in PHP. Then you can build these elements dynamically from the data in the database.

Use JavaScript for any client-side actions, in other words, anything that happens after the page has been generated, such as fetching more data.

Upvotes: 0
