Aaron Yodaiken

Reputation: 19551

simplexml php get xml

If I have a document like this:

<!-- in doc.xml -->
<a>
  <b>
    greetings?
    <c>hello</c>
    <d>goodbye</d>
  </b>
</a>

Is there any way to use simplexml (or any php builtin really) to get a string containing:

greetings?
<c>hello</c>
<d>goodbye</d>

Whitespace and such doesn't matter.

Thanks!

Upvotes: 0

Views: 374

Answers (3)

salathe

Reputation: 51950

Here's an alternative using DOM (to balance the SimpleXML answers!) that outputs all of the contents of the first <b> element.

$doc = new DOMDocument;
$doc->load('doc.xml');
$bee = $doc->getElementsByTagName('b')->item(0);

$innerxml = '';
foreach ($bee->childNodes as $node) {
    $innerxml .= $doc->saveXML($node);
}
echo $innerxml;
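If you are starting from SimpleXML rather than DOM, the same loop can be reused by converting the element with dom_import_simplexml(). The following is only a sketch: the helper name innerXml() is made up for illustration, and the XML is loaded from an inline string (instead of doc.xml) so the snippet runs on its own.

```php
<?php
// Hypothetical helper (name is an assumption, not a built-in):
// serializes every child node of a SimpleXML element back to XML.
function innerXml(SimpleXMLElement $element): string
{
    // Get the DOM view of the same underlying node.
    $dom = dom_import_simplexml($element);

    $inner = '';
    foreach ($dom->childNodes as $child) {
        // saveXML() with a node argument serializes just that node.
        $inner .= $dom->ownerDocument->saveXML($child);
    }
    return $inner;
}

// Inline copy of the question's document, whitespace removed for brevity.
$xml = simplexml_load_string('<a><b>greetings?<c>hello</c><d>goodbye</d></b></a>');
$result = innerXml($xml->b);

echo $result, "\n"; // greetings?<c>hello</c><d>goodbye</d>
```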

Upvotes: 0

Wiseguy

Reputation: 20873

I must admit this wasn't as simple as one would think. This is what I came up with:

$xml = new DOMDocument;
$xml->load('doc.xml');

// find just the <b> node(s)
$xpath = new DOMXPath($xml);
$results = $xpath->query('/a/b');

// get entire <b> node as text
$node = $results->item(0);
$text = $xml->saveXML($node);

// remove encapsulating <b></b> tags
$text = preg_replace('#^<b>#', '', $text);
$text = preg_replace('#</b>$#', '', $text);

echo $text;

Regarding the XPath query

The query returns all matching nodes, so if there are multiple matching <b> tags, you can loop through $results to get them all.
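As a rough sketch of that loop, assuming a made-up document with two <b> elements (loaded from an inline string here so it runs without doc.xml), the same save-and-strip steps can be applied to each result:

```php
<?php
// Assumed sample input with two <b> elements; not the question's file.
$xml = new DOMDocument;
$xml->loadXML('<a><b>one<c>hello</c></b><b>two<d>goodbye</d></b></a>');

$xpath = new DOMXPath($xml);
$results = $xpath->query('/a/b');

$contents = [];
// DOMNodeList is traversable, so foreach visits each matching <b>.
foreach ($results as $node) {
    $text = $xml->saveXML($node);
    // Strip only the outermost <b> and </b>.
    $text = preg_replace('#^<b>#', '', $text);
    $text = preg_replace('#</b>$#', '', $text);
    $contents[] = $text;
}

print_r($contents);
```

Here $contents ends up holding one string per <b> element.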

My query for '/a/b' assumes that <a> is the root and <b> is its child/immediate descendant. You could alter it for different scenarios. Here's an XPath reference. Some adjustments might include:

  • '//a/b' –– <b> is a child of <a>, and <a> can be anywhere in the document, not just at the root
  • 'a//b' –– <b> is a descendant of <a> no matter how deep, not just a direct child
  • '//b' –– all <b> nodes anywhere in the document
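To compare how such expressions behave, here is a small sketch that counts matches on an assumed sample document (the <root> and <x> elements are invented for the demo). One thing it illustrates: a path without a leading slash, such as 'a/b', is evaluated relative to the context node, which is the document itself by default, so it takes a leading // to find <a> at any depth.

```php
<?php
// Assumed sample: <a> is NOT the root here, and <b> appears at several depths.
$xml = new DOMDocument;
$xml->loadXML('<root><a><b>1</b><x><b>2</b></x></a><b>3</b></root>');

$xpath = new DOMXPath($xml);

$counts = [];
foreach (['/a/b', 'a/b', '//a/b', '//a//b', '//b'] as $expr) {
    // query() returns a DOMNodeList; length is the number of matches.
    $counts[$expr] = $xpath->query($expr)->length;
}

print_r($counts);
// '/a/b' and 'a/b' match nothing because <a> is not the root element;
// '//a/b' matches 1 node, '//a//b' matches 2, '//b' matches 3.
```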

Regarding method of obtaining string contents

I tried using $node->nodeValue or $node->textContent, but both of them strip out the <c> and <d> tags, leaving just the text contents of those. I also tried casting it as a DOMText object, but that didn't directly work and was more trouble than it was worth.
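For illustration, this short sketch shows the difference on an inline copy of the question's document (whitespace removed): textContent flattens everything to text, while saveXML() on the node keeps the markup.

```php
<?php
$doc = new DOMDocument;
$doc->loadXML('<a><b>greetings?<c>hello</c><d>goodbye</d></b></a>');
$b = $doc->getElementsByTagName('b')->item(0);

// Text of the element and all its descendants, tags dropped.
$plain = $b->textContent;
// Full serialization of the node, including its own <b> tags.
$markup = $doc->saveXML($b);

echo $plain, "\n";  // greetings?hellogoodbye
echo $markup, "\n"; // <b>greetings?<c>hello</c><d>goodbye</d></b>
```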

Regarding the use of regular expressions

It could be done without regex, but I found it easiest to use them. I wanted to make sure that I only stripped the <b> and </b> at the very beginning and end of the string, just in case there were other <b> nodes within the contents.
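For completeness, a regex-free variant might look something like the sketch below. It assumes, as the regex version does, that the <b> element has no attributes, so the serialized string starts with exactly <b> and ends with exactly </b>. The nested sample input is made up to show that inner <b> tags survive.

```php
<?php
// Assumed sample with a nested <b> to check only the outer pair is removed.
$xml = new DOMDocument;
$xml->loadXML('<a><b>x<b>y</b></b></a>');

$xpath = new DOMXPath($xml);
$node = $xpath->query('/a/b')->item(0);
$text = $xml->saveXML($node);

$open = '<b>';
$close = '</b>';
// Only strip when the string really starts and ends with the wrapper tags.
if (strncmp($text, $open, strlen($open)) === 0
    && substr($text, -strlen($close)) === $close) {
    $text = substr($text, strlen($open), -strlen($close));
}

echo $text, "\n"; // x<b>y</b>
```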

Upvotes: 2

Osh Mansor

Reputation: 1232

How about this? Since you already know the XML format:

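<?php
$xml = simplexml_load_file('doc.xml');
// Text directly inside <b>; cast explicitly so $str is a plain string
$str = (string) $xml->b;
// Re-wrap the known child elements by hand
$str .= "<c>" . $xml->b->c . "</c>";
$str .= "<d>" . $xml->b->d . "</d>";

echo $str;
?>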

Upvotes: 0
