How to convert PHP to XML output

Question

I have some PHP code that currently outputs HTML, and I need to modify it to output XML instead. Any ideas on how I should go about this? Is there an XML library that does the job directly, or do I have to create each node manually?

My PHP code is:












\n(No results.)/', $d,$nore);


preg_match_all('#((?:[a-z]*(?:&*[.]*)?\s*-*[a-z]*[0-9]*[^<])+)/i',$d,$tit);  //title 

preg_match_all('/\s*\(([\d]*)/',$d,$ye); //movie year working fine

preg_match_all('#\n    Dir: (.*)\n(?:    With:)?#Us',$d,$dir); //director 

preg_match_all('/([\w]*.[\w]*)/i',$d,$rat); //rating 

preg_match_all('/5)
$cnt=5;
else
$cnt=count($tit[1]);



 echo"Search Result";
echo "
";
echo ""$a"of type"$b":";
echo"
";

if(@$nore[1][0]=="No results.")
echo "No movies found!";
else
{
echo "";
  for($j=0;$j<$cnt;$j++)
          {
            echo "";
            echo "";
            echo "";
            echo "";
            echo "";
            echo "";
            echo '';
            echo "";
          }




echo "Image Title Year Director Rating(10) Link to Movie
".@$img[0][$j+2]." ".@$tit[1][$j]." ".@$ye[1][$j]." ".@$dir[1][$j]." ".@$rat[1][$j]." Details";
}               

?>

Expected XML output:

Tivie · Accepted Answer

First, you're parsing the HTML result with regular expressions, which is inefficient, unnecessary, and... well, you're answering the call of Cthulhu!

Second, parsing IMDB HTML to retrieve results, although valid, might be unnecessary. There are some neat third-party APIs that do the job for you, like http://imdbapi.org.

If you don't want to use a third-party API, though, then IMHO you should parse the HTML with a DOM parser/manipulator such as DOMDocument instead. It is safer and more robust than regex and, at the same time, it can solve your HTML-to-XML problem.
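To illustrate the DOM approach on its own, here is a minimal sketch. The markup in $html is a made-up stand-in for a fetched results page (only the class names "results" and "title" are taken from the code further down), so treat it as an assumption rather than IMDB's actual structure.

```php
<?php
// Hypothetical sample markup standing in for a fetched results page.
$html = '<table class="results">'
      . '<tr><td class="title"><a href="/title/tt0000001/">First Movie</a></td></tr>'
      . '<tr><td class="title"><a href="/title/tt0000002/">Second Movie</a></td></tr>'
      . '</table>';

// Real-world HTML is rarely valid, so collect parser warnings
// internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Walk the cells and keep the text of the first link in each "title" cell.
$titles = array();
foreach ($doc->getElementsByTagName('td') as $td) {
    if ($td->getAttribute('class') === 'title') {
        $link = $td->getElementsByTagName('a')->item(0);
        if ($link !== null) {
            $titles[] = trim($link->nodeValue);
        }
    }
}

print_r($titles);
```

No regex is involved: the parser builds a tree, and you ask the tree for the nodes you care about.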

Here's the bit you asked for (building XML and HTML from the results):

function resultsToHTML($results)
{
    $doc = new DOMDocument();
    $table = $doc->createElement('table');
    //attach the table to the document, otherwise it is never output
    $doc->appendChild($table);

    foreach ($results as $r) {
        $row = $doc->createElement('tr');
        //rows belong to the table, not directly to the document
        $table->appendChild($row);
        $title = $doc->createElement('td', $r['title']);
        $row->appendChild($title);
        $year = $doc->createElement('td', $r['year']);
        $row->appendChild($year);
        $rating = $doc->createElement('td', $r['rating']);
        $row->appendChild($rating);

        $imgTD = $doc->createElement('td');

        //Option 1: creating an img tag (use only one of the two options)
        $img = $doc->createElement('img');
        $img->setAttribute('src', $r['img_src']);
        $imgTD->appendChild($img);
        $row->appendChild($imgTD);

        $imgTD = $doc->createElement('td');

        //Option 2: importing directly from the old document
        //$r['img'] is an HTML fragment (from saveHTML), so load it as HTML
        $fauxDoc = new DOMDocument();
        $fauxDoc->loadHTML($r['img']);
        $img = $fauxDoc->getElementsByTagName('img')->item(0);
        $importedImg = $doc->importNode($img, true);
        $imgTD->appendChild($importedImg);
        $row->appendChild($imgTD);
    }
    return $doc;
}

function resultsToXML($results)
{
    $doc = new DOMDocument();
    $root = $doc->createElement('results');
    foreach ($results as $r) {
        //elements are always created by the document, then appended to their parent
        $element = $doc->createElement('result');
        $element->setAttribute('cover', $r['img_src']);
        $element->setAttribute('title', $r['title']);
        $element->setAttribute('year', $r['year']);
        $element->setAttribute('rating', $r['rating']);
        $root->appendChild($element);
    }
    $doc->appendChild($root);
    return $doc;
}
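One caveat when building nodes this way: the text passed as the second argument to createElement() is not escaped for ampersands, so a title such as "Fast & Furious" triggers a warning and can lose part of its text. Values set with setAttribute() or added through createTextNode() are escaped on output. A minimal sketch of a workaround follows; createTextElement() is a hypothetical helper name, not something DOMDocument provides.

```php
<?php
// Hypothetical helper (an assumption, not part of the DOM API): creates an
// element and adds its text as a separate text node, so that '&', '<' and
// '>' are escaped when the document is serialized.
function createTextElement(DOMDocument $doc, $name, $text)
{
    $element = $doc->createElement($name);
    $element->appendChild($doc->createTextNode((string) $text));
    return $element;
}

$doc = new DOMDocument();
$row = $doc->createElement('tr');
$row->appendChild(createTextElement($doc, 'td', 'Fast & Furious'));
$doc->appendChild($row);

// Serialize only the row so the XML declaration is left out.
$markup = $doc->saveXML($row);
echo $markup, "\n";
```

A call like createTextElement($doc, 'td', $r['title']) could stand in for the createElement('td', $r['title']) calls wherever the text comes from scraped data.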

To print the results, call the function and output the returned document:

$xml = resultsToXML($results);
print $xml->saveXML();

The same goes for the HTML version: call resultsToHTML($results) and print the document with saveHTML().
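When the XML is served over HTTP, it usually helps to send a matching Content-Type header and to guard against missing keys, since a scraped row may lack a year or a rating. The sketch below is one way to do that; buildResultsXml() and the sample data are assumptions made for illustration.

```php
<?php
// Hypothetical sample data in the shape used by the functions above.
$results = array(
    array(
        'title'   => 'First Movie',
        'year'    => '2001',
        'rating'  => '7.5',
        'img_src' => 'http://example.com/1.jpg',
    ),
);

// Hypothetical variant of the XML builder that skips missing keys.
function buildResultsXml(array $results)
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $doc->formatOutput = true; // indent the output for readability
    $root = $doc->createElement('results');
    $doc->appendChild($root);

    foreach ($results as $r) {
        $element = $doc->createElement('result');
        foreach (array('title', 'year', 'rating') as $key) {
            if (isset($r[$key])) {
                $element->setAttribute($key, $r[$key]);
            }
        }
        if (isset($r['img_src'])) {
            $element->setAttribute('cover', $r['img_src']);
        }
        $root->appendChild($element);
    }
    return $doc;
}

$xml = buildResultsXml($results)->saveXML();

// Only send the header when running under a web server.
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    header('Content-Type: application/xml; charset=UTF-8');
}
echo $xml;
```

The header has to be sent before any other output, so keep stray echo statements and whitespace out of the script.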

Here's a refactor of your code with DOMDocument, based on your post:

// $c is the search URL built in your original code
$doc = new DOMDocument();
$doc->loadHTMLFile($c);

//initialize array to store results
$results = array();

// get table of results and extract a list of rows
$listOfTables = $doc->getElementsByTagName('table');
$rows = getResultRows($listOfTables);

$i = 0;
//loop through all rows to retrieve information
foreach ($rows as $row) {
    if ($title = getTitle($row)) {
        $results[$i]['title'] = $title;
    }
    if (!is_null($year = getYear($row)) && $year) {
        $results[$i]['year'] = $year;
    }
    if (!is_null($rating = getRating($row)) && $rating) {
        $results[$i]['rating'] = $rating;
    }
    if ($img = getImage($row, $doc)) {
        $results[$i]['img'] = $img;
    }
    if ($src = getImageSrc($row)) {
        $results[$i]['img_src'] = $src;
    }
    ++$i;
}

//the first result can be a false positive due to the
// results' table header, so we remove it
if (isset($results[0])) {
    array_shift($results);
}

FUNCTIONS

function getResultRows($listOfTables)
{
    foreach ($listOfTables as $table) {
        if ($table->getAttribute('class') === 'results') {
            return $table->getElementsByTagName('tr');
        }
    }
    // no results table found: return an empty list so the caller's foreach is safe
    return array();
}

function getImageSrc($row)
{
    $img = $row->getElementsByTagName('img')->item(0);
    if (!is_null($img)) {
        return $img->getAttribute('src');
    } else {
        return false;
    }
}

function getImage($row, $doc)
{
    $img = $row->getElementsByTagName('img')->item(0);
    if (!is_null($img)) {
        return $doc->saveHTML($img);
    } else {
        return false;
    }
}


function getTitle($row)
{
    $tdInfo = getTDInfo($row->getElementsByTagName('td'));
    // getElementsByTagName() never returns null, so check for an actual first link
    $link = is_null($tdInfo) ? null : $tdInfo->getElementsByTagName('a')->item(0);
    if (!is_null($link)) {
        return $link->nodeValue;
    } else {
        return false;
    }
}


function getYear($row)
{
    $tdInfo = getTDInfo($row->getElementsByTagName('td'));
    if (!is_null($tdInfo) && !is_null($spans = $tdInfo->getElementsByTagName('span'))) {
        foreach ($spans as $span) {
            if ($span->getAttribute('class') === 'year_type') {
                return str_replace(')', '', str_replace('(', '', $span->nodeValue));
            }
        }
    }
}

function getRating($row)
{
    $tdInfo = getTDInfo($row->getElementsByTagName('td'));
    if (!is_null($tdInfo) && !is_null($spans = $tdInfo->getElementsByTagName('span'))) {
        foreach ($spans as $span) {
            if ($span->getAttribute('class') === 'rating-rating') {
                return $span->nodeValue;
            }
        }
    }
}


function getTDInfo($tds)
{
    foreach ($tds as $td) {
        if ($td->getAttribute('class') == 'title') {
            return $td;
        }
    }
}
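If the loops over td and span elements feel verbose, the same lookups can be written as XPath queries with DOMXPath. This is a sketch of an alternative, not a drop-in replacement: the sample markup is invented, and only the class names (results, title, year_type, rating-rating) come from the functions above.

```php
<?php
// Hypothetical sample markup that reuses the class names from the functions above.
$html = '<table class="results">'
      . '<tr><th>Header</th></tr>'
      . '<tr><td class="title"><a href="#">First Movie</a>'
      . '<span class="year_type">(2001)</span>'
      . '<span class="rating-rating">7.5</span></td></tr>'
      . '</table>';

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$results = array();

// Select only the "title" cells inside the results table; the header row
// has no such cell, so it never produces a false positive.
foreach ($xpath->query('//table[@class="results"]//td[@class="title"]') as $td) {
    $results[] = array(
        // string() returns the text of the first matching node, or '' if none
        'title'  => trim($xpath->evaluate('string(.//a)', $td)),
        'year'   => trim($xpath->evaluate('string(.//span[@class="year_type"])', $td), "() \t\n"),
        'rating' => trim($xpath->evaluate('string(.//span[@class="rating-rating"])', $td)),
    );
}

print_r($results);
```

Note that @class="title" matches the attribute exactly, like the == comparison in getTDInfo(), so a cell with several classes would need a contains() test instead.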
