Reputation: 1393
I have an xml with 10 records and the structure is:
<entry>
<title>My Title</title>
<link rel="alternate" type="text/html" href="http://myweb.com/posts/one.html"/>
<published>2014-07-07T00:34:00+00:00</published>
<updated>2014-07-07T00:34:00+00:00</updated>
<id>http://myweb.com/posts/one.html</id>
<author>
<name>Myweb.com</name>
</author>
<content>
Some Content Here
</content>
<link rel="enclosure" href="http://myweb.com/uploads/300px-300px.jpg" type="image/jpeg" length=""/>
</entry>
I am using the code below to parse it, and it's almost working, except that I can't fetch the image URL that is in the second link element:
<link rel="enclosure" href="http://myweb.com/uploads/300px-300px.jpg" type="image/jpeg" length=""/>
My code is:
$url = "http://myweb.com/posts.xml";
$xml = simplexml_load_file($url);
foreach($xml->entry as $PRODUCT) {
$my_title = trim($PRODUCT->title);
$url = trim($PRODUCT->id);
$im = (string)$PRODUCT->xPath('//link[@rel="enclosure"]');
echo $my_title . " " . $url . " " . $im;
echo "<br>";
}
This: $im = (string)$PRODUCT->xPath('//link[@rel="enclosure"]');
Returns "Array" and not the URL inside href.
Thanks
Upvotes: 0
Views: 104
Reputation: 97783
This:
$im = (string)$PRODUCT->xPath('//link[@rel="enclosure"]');
Returns "Array" and not the URL inside href.
Whenever you see a string containing the word "Array" in PHP, where you were expecting something else, you need to think "hm, I seem to have cast an array to a string, how did that happen?" (Similarly, if you unexpectedly see the string "A", consider the possibility that it's a one-letter substring of "Array").
In this case, the reason why is quite simple: if you look up the manual page for the SimpleXMLElement::xpath() method, you'll see that it returns an array unless there is an error (not finding a match is not an error, and will give you an empty array).
The only reason this is surprising, is that most methods on that class return another instance of the same class, with magic overloads for things like the (string) cast. However, all of those objects represent a more-or-less coherent fragment of the XML document (e.g. 1 or more consecutive nodes, or siblings filtered by a particular tag-name), and can never represent "nothing". An XPath result could be empty, or contain nodes of various types from all over the document; I don't know for sure, but I suspect this is why an array return was chosen here rather than another variety of SimpleXMLElement object.
So $PRODUCT->xPath('//link[@rel="enclosure"]')[0] will give you the first result (or $xpath_results = $PRODUCT->xPath('//link[@rel="enclosure"]'); $im = $xpath_results[0] if you can't rely on at least PHP 5.4, or want to insert a check in between for no nodes being matched).
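The "check in between" mentioned above could look roughly like the sketch below. It is only an illustration: first_match() is a hypothetical helper (not part of SimpleXML), the inline XML is made-up sample data without a namespace, and the type hints assume PHP 7.1 or later.

```php
<?php
// Hypothetical sample data standing in for one entry of the feed (no namespace here).
$entry = simplexml_load_string(
    '<entry>'
    . '<link rel="alternate" href="http://myweb.com/posts/one.html"/>'
    . '<link rel="enclosure" href="http://myweb.com/uploads/300px-300px.jpg"/>'
    . '</entry>'
);

// first_match() is a hypothetical helper, not a SimpleXML method.
function first_match(SimpleXMLElement $node, string $expression): ?SimpleXMLElement
{
    $results = $node->xpath($expression);
    // xpath() gives an array on success; a non-array or an empty array means no usable node.
    if (!is_array($results) || count($results) === 0) {
        return null;
    }
    return $results[0];
}

// Relative expression: only <link> children of $entry are considered.
$link = first_match($entry, 'link[@rel="enclosure"]');
$im = $link === null ? '' : (string) $link['href'];

// An expression that matches nothing yields null instead of an "undefined offset" error.
$missing = first_match($entry, 'link[@rel="nonexistent"]');
```

Returning null for "no match" keeps the caller's check explicit; whether an empty string is a sensible fallback for $im depends on what the surrounding code expects.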
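There are a few extra catches here, though:
- If this is an Atom feed, its elements are in a namespace, so you need to register a prefix for it with $product->registerXpathNamespace('atom', 'http://www.w3.org/2005/Atom'); and then use it in your XPath expression (e.g. //atom:link rather than //link).
- You still need to get at the href attribute: either change your XPath expression to select it (//link[@rel="enclosure"]/@href) or change your access to grab it from the SimpleXMLElement returned ($xpath_results[0]['href']).
- An expression starting with // searches the whole document, not just the current entry, so inside the loop a relative path such as atom:link[@rel="enclosure"] is what gives you each entry's own image.
Stick it all together (and get rid of that ugly and unusual all-caps variable name), and the compact version (no error checking, minimum readability) would be either: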
$product->registerXpathNamespace('atom', 'http://www.w3.org/2005/Atom');
// Relative path, so only the current entry's links are matched
$im = (string)$product->xPath('atom:link[@rel="enclosure"]')[0]['href'];
or
$product->registerXpathNamespace('atom', 'http://www.w3.org/2005/Atom');
$im = (string)$product->xPath('atom:link[@rel="enclosure"]/@href')[0];
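To see the pieces working together in the loop, here is a minimal sketch against a made-up two-entry feed. The feed content, the xmlns declaration on the root element (the question's sample does not show it), and the $feedXml and $images names are all assumptions for illustration.

```php
<?php
// Hypothetical two-entry Atom feed; the xmlns declaration is an assumption.
$feedXml = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>One</title>
    <link rel="alternate" href="http://myweb.com/posts/one.html"/>
    <link rel="enclosure" href="http://myweb.com/uploads/one.jpg"/>
  </entry>
  <entry>
    <title>Two</title>
    <link rel="alternate" href="http://myweb.com/posts/two.html"/>
    <link rel="enclosure" href="http://myweb.com/uploads/two.jpg"/>
  </entry>
</feed>
XML;

$feed = simplexml_load_string($feedXml);
$images = [];
foreach ($feed->entry as $product) {
    // The prefix has to be registered on each element that xpath() is called on.
    $product->registerXPathNamespace('atom', 'http://www.w3.org/2005/Atom');
    // Relative path: only <link> children of the current entry are considered.
    $hrefs = $product->xpath('atom:link[@rel="enclosure"]/@href');
    // Fall back to an empty string when the entry has no enclosure link.
    $images[trim((string) $product->title)] = $hrefs ? (string) $hrefs[0] : '';
}
print_r($images);
```

Because the expression is relative to the current entry, each title ends up paired with its own image; an expression starting with // would keep returning the first enclosure in the document.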
Upvotes: 2
Reputation: 19502
That looks like it is part of an Atom feed. This means it has a namespace. To use XPath on XML with namespaces, you have to register an alias/prefix for the namespace. This is a little complex with SimpleXML: you have to do it on each element you call the xpath() method on, and xpath() will always return an array of SimpleXMLElement objects. Note that the expression is relative to the current entry; one starting with // would search the whole document.
$feed = simplexml_load_string($xml);
foreach($feed->entry as $product) {
$product->registerXpathNamespace('atom', 'http://www.w3.org/2005/Atom');
var_dump((string)$product->xpath('atom:link[@rel="enclosure"]')[0]['href']);
}
Demo: https://eval.in/170439
With DOMXpath this is easier: the namespaces only need to be registered on the DOMXpath object once, and DOMXpath::evaluate() can return scalar values. The second argument is the context node for the XPath expression:
$dom = new DOMDocument();
$dom->loadXml($xml);
$xpath = new DOMXpath($dom);
$xpath->registerNamespace('atom', 'http://www.w3.org/2005/Atom');
foreach($xpath->evaluate('//atom:entry') as $product) {
var_dump($xpath->evaluate('string(atom:link[@rel="enclosure"]/@href)', $product));
}
Demo: https://eval.in/170444
Upvotes: 1