megatr0n

Reputation: 362

Parsing HTML with PHP

I can't get the data between the tags into the arrays:

// Load the HTML string from file and create a SimpleXMLElement
$html_string = file_get_contents("data/csr.html"); /*the string really is in $html_string*/
$root = new SimpleXMLElement($html_string);

The problem starts here, when I try to get the values between the div, h2 and span tags into arrays:

// Fetch all div, h2 and span values
$divArray = $hdlsArray = $dtlsArray = array();
foreach ($root->div as $div) {
    $divArray[] = $div;
    echo "".$div."<br />";
}
foreach ($root->h2 as $h2) {
    $hdlsArray[] = $h2;
    echo "".$h2."<br />";
}
foreach ($root->span as $span) {
    $dtlsArray[] = $span;
    echo "".$span."<br />";
}

The result is a blank page instead of the actual tag data.

Upvotes: 0

Views: 244

Answers (2)

Salman Arshad

Reputation: 272106

As an alternative to SimpleXMLElement, I suggest Simple HTML DOM (online manual). I've used it before and was very satisfied with the results. It lets you use jQuery-like selectors, so fetching all div, h2 and span values is fairly simple.

Upvotes: 2

pavium

Reputation: 15118

This page says (about SimpleXML) "the only problem with it is that it'll only load valid XML" but may provide a workaround for HTML.
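One workaround of this kind (not necessarily the one the linked page describes) is to let DOMDocument's more forgiving HTML parser read the markup and then hand the result to SimpleXML with simplexml_import_dom(). The sketch below is a minimal illustration under assumptions: the sample markup and the collectText() helper are made up for the example, and in the question the string would come from file_get_contents("data/csr.html") instead. It uses xpath('//tag') because $root->div only matches div elements that are direct children of the root element, which in an HTML document is html, so nested tags would be missed.

```php
<?php
// Hypothetical sample markup standing in for data/csr.html.
// The unclosed <br> makes it invalid XML, so new SimpleXMLElement() would reject it.
$html_string = '<html><body><div>Intro text<h2>Headline</h2>'
             . '<span>Detail<br></span></div></body></html>';

// Collect parser warnings internally instead of printing them to the page.
libxml_use_internal_errors(true);

// DOMDocument::loadHTML() tolerates markup that is not well-formed XML.
$doc = new DOMDocument();
$doc->loadHTML($html_string);
libxml_clear_errors();

// Convert the DOM tree into a SimpleXMLElement rooted at <html>.
$root = simplexml_import_dom($doc);

// Hypothetical helper: returns the direct text of every <$tag> in the document.
function collectText(SimpleXMLElement $root, string $tag): array
{
    $values = array();
    // '//' searches the whole tree, not only direct children of the root.
    foreach ($root->xpath('//' . $tag) as $node) {
        // Casting to string yields the element's own text, not its children's.
        $values[] = trim((string) $node);
    }
    return $values;
}

$divArray  = collectText($root, 'div');
$hdlsArray = collectText($root, 'h2');
$dtlsArray = collectText($root, 'span');
```

Storing the string cast rather than the SimpleXMLElement itself keeps the arrays as plain text, which is usually what is wanted when the values are only echoed afterwards.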

The 'Related Questions' on StackOverflow include this one, but it describes HTML inside valid XML tags.

Upvotes: 1
