Lacrifilm

Reputation: 283

DomDocument get all divs and put inside an array

I have some divs with the same id and the same class, as you can see below:

<div id="results_information" class="control_results">
<!-- I have divs, subDivs, span, images inside -->
</div>

<div id="results_information" class="control_results">
<!-- I have divs, subDivs, span, images inside -->
</div>

....

I want to save all of them in an array for later use, in this format:

[0] => '<div id="results_information" class="control_results">
<!-- I have divs, subDivs, span, images inside -->
</div>',

[1] => '<div id="results_information" class="control_results">
<!-- I have divs, subDivs, span, images inside -->
</div>',

....

For that I'm using this code below:

$dom = new DOMDocument(); // Create DOMDocument object.
$dom->loadHTMLFile($htmlOut); // Load target file.
$div =$dom->getElementById('results_information'); // Take all div elements.

But it doesn't work. How can I solve this problem and put my divs inside an array?

Upvotes: 1

Views: 1235

Answers (2)

user3372120

Reputation: 144

To solve your problem, follow the steps below:

First of all, select by class rather than by id, because an id is supposed to be unique within a document and getElementById() returns at most one element.
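To illustrate the difference, here is a minimal sketch. The two-div sample markup and the variable names $single and $all are assumptions made for the illustration; libxml_use_internal_errors() is only there to keep the parser quiet about the duplicated id.

```php
<?php
// Sample markup that reuses the same id twice, as in the question.
$html = '<div id="results_information" class="control_results"><b>1</b></div>'
      . '<div id="results_information" class="control_results"><b>2</b></div>';

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

// getElementById() returns a single DOMElement or null, never a list.
$single = $dom->getElementById('results_information');

// An XPath query on the class returns a DOMNodeList with every match.
$xpath = new DOMXPath($dom);
$all = $xpath->query('//div[@class="control_results"]');

var_dump($single instanceof DOMElement || $single === null);
echo $all->length, "\n";
```

The XPath query is the part that scales to any number of divs, which is why the rest of this answer builds on it.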

Assume that you have the following HTML in a variable called $htmlOut:

<div id="results_information" class="control_results">
<span style="background:black; color:white">
hellow world
</span>
<strong>2</strong>
</div>

<div id="results_information" class="control_results">
<strong>2</strong>
<img src="hello.png" />
</div>

We need to extract all the HTML inside these two divs with the class control_results and put it in an array. For this we work with DOMDocument and DOMXPath:

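$array = array();              // Will hold one HTML string per matching div.
$dom = new DOMDocument();
$dom->loadHTML($htmlOut);      // Parse the HTML string.
$finder = new DOMXPath($dom);
$classname = "control_results";
// Select every element whose class attribute contains the class name.
$nodes = $finder->query("//*[contains(@class, '$classname')]");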

That code selects every element whose class attribute contains control_results and stores the matches in the variable $nodes.
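One caveat: contains(@class, ...) is a substring test, so it would also match a class such as control_results_old. If that matters for your markup, a commonly used stricter pattern pads the attribute with spaces before comparing. The sketch below uses made-up sample markup (the classes box and control_results_old are assumptions) to compare the two queries:

```php
<?php
// Hypothetical markup: two real matches and one look-alike class name.
$html = '<div class="control_results">a</div>'
      . '<div class="box control_results">b</div>'
      . '<div class="control_results_old">c</div>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

$classname = 'control_results';

// Substring match: also catches control_results_old.
$loose = $xpath->query("//*[contains(@class, '$classname')]");

// Whole-word match: pad the class list with spaces, then look for " name ".
$strict = $xpath->query(
    "//*[contains(concat(' ', normalize-space(@class), ' '), ' $classname ')]"
);

echo $loose->length, "\n";  // 3
echo $strict->length, "\n"; // 2
```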

Now we need to walk through $nodes (a DOMNodeList, which can be iterated like an array) and extract the HTML of those two divs. For this I created a helper function:

function get_inner_html($node) {
    $innerHTML = '';
    $children = $node->childNodes;
    foreach ($children as $child) {
        // Serialize each child node and append it to the result.
        $innerHTML .= $child->ownerDocument->saveXML($child);
    }

    return $innerHTML;
}

This function serializes every child node (all the HTML inside a control_results div) and returns the result as a string.

Now you only need to loop over $nodes with foreach and call that function, like this:

foreach ($nodes as $rowNode) {
    $array[] = get_inner_html($rowNode);
}

var_dump($array);

Below is the complete code:

$htmlOut = '
<div id="results_information" class="control_results">
<span style="background:black; color:white">
hellow world
</span>
<strong>2</strong>
</div>

<div id="results_information" class="control_results">
<strong>2</strong>
<img src="hello.png" />
</div>
';

$array = array();
$dom = new DOMDocument();
$dom->loadHTML($htmlOut);
$finder = new DOMXPath($dom);
$classname = "control_results";
$nodes = $finder->query("//*[contains(@class, '$classname')]");

foreach ($nodes as $rowNode) {
    $array[] = get_inner_html($rowNode);
}

var_dump($array);


function get_inner_html( $node ) { 
    $innerHTML= ''; 
    $children = $node->childNodes; 
    foreach ($children as $child) { 
        $innerHTML .= $child->ownerDocument->saveXML( $child ); 
    } 

    return $innerHTML;  
}  

But this code has a small problem. If you check the array, the result is:

 0 => string '<span style="background:black; color:white">hellow world</span><strong>2</strong>',

 1 => string '<strong>2</strong><img src="hello.png"/>'

instead of:

 0 => string '<div id="results_information" class="control_results"><span style="background:black; color:white">hellow world</span><strong>2</strong></div>',

 1 => string '<div id="results_information" class="control_results"><strong>2</strong><img src="hello.png"/></div>'

In this case you can loop over the array with foreach, prepend the opening div tag and append the closing div tag to each entry, and save the result back into the array.
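As an alternative to wrapping each string by hand, DOMDocument::saveHTML() accepts a node argument (PHP 5.3.6 and later) and serializes that node together with its own tag. The following is a minimal sketch under that assumption, using a shortened version of the sample markup; note that saveHTML() writes HTML-style output, so void elements such as img are not self-closed the way saveXML() writes them.

```php
<?php
// Shortened sample markup, modelled on $htmlOut from the answer above.
$htmlOut = '<div id="results_information" class="control_results"><strong>2</strong></div>'
         . '<div id="results_information" class="control_results"><img src="hello.png" /></div>';

libxml_use_internal_errors(true); // the duplicated id triggers a parser warning
$dom = new DOMDocument();
$dom->loadHTML($htmlOut);
libxml_clear_errors();

$finder = new DOMXPath($dom);
$nodes = $finder->query('//div[@class="control_results"]');

$array = array();
foreach ($nodes as $node) {
    // saveHTML($node) serializes the node itself plus all of its children.
    $array[] = $dom->saveHTML($node);
}

print_r($array);
```

With this approach the get_inner_html() helper is not needed, since each entry already starts with the opening div tag and ends with the closing one.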

Upvotes: 3

Abdullah A Malik

Reputation: 354

You will need to use XPath and select the elements by class name.

$dom = new DOMDocument();
$dom->loadHTMLFile($htmlOut); // Load the document before querying it.
$xpath = new DOMXPath($dom);
$div = $xpath->query('//div[contains(@class, "control_results")]');

Upvotes: 1
