Zsolt Szilagyi

Reputation: 5016

RSS parsing with PHP and SimpleXML: How to enter namespaced items?

I am parsing the following RSS feed (only the relevant part is shown):

<item>
    <title>xxx</title>
    <link>xxx</link>
    <guid>xxx</guid>
    <description>xxx</description>
    <prx:proxy>
        <prx:ip>101.226.74.168</prx:ip>
        <prx:port>8080</prx:port>
        <prx:type>Anonymous</prx:type>
        <prx:ssl>false</prx:ssl>
        <prx:check_timestamp>1369199066</prx:check_timestamp>
        <prx:country_code>CN</prx:country_code>
        <prx:latency>20585</prx:latency>
        <prx:reliability>9593</prx:reliability>
    </prx:proxy>
    <prx:proxy>...</prx:proxy>
    <prx:proxy>...</prx:proxy>
    <pubDate>xxx</pubDate>
</item>
<item>...</item>
<item>...</item>
<item>...</item>

I am using this PHP code:

$proxylist_rss = file_get_contents('http://www.xxx.com/xxx.xml');
$proxylist_xml = new SimpleXmlElement($proxylist_rss);

foreach($proxylist_xml->channel->item as $item) {

    var_dump($item); // Ok, Everything marked with xxx
    var_dump($item->title); // Ok, title

    foreach($item->proxy as $entry) {
        var_dump($entry); // empty

    }

}

While I can access everything marked with xxx, I cannot access anything inside prx:proxy, mainly because a colon (:) is not allowed in PHP property names.

The question is how to reach the namespaced elements, for example prx:ip.

Thanks!

Upvotes: 0

Views: 1435

Answers (3)

Expedito

Reputation: 7795

Try it like this:

$proxylist_rss = file_get_contents('http://www.xxx.com/xxx.xml');
$feed = simplexml_load_string($proxylist_rss);
$ns = $feed->getNamespaces(true);
foreach ($feed->channel->item  as $item){
    var_dump($item);
    var_dump($item->title); 
    $proxy = $item->children($ns["prx"]);
    $proxy = $proxy->proxy;
    foreach ($proxy as $key => $value){
        var_dump($value);
    }
}
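To make the loop above easier to try without the real feed, here is a minimal self-contained sketch. The sample XML, the namespace URI http://example.org/ and the second proxy entry are made up for illustration; only getNamespaces(true) and children() are taken from the code above.

```php
<?php
// Hypothetical sample feed; the namespace URI is a placeholder, not the real one.
$rss = <<<XML
<rss version="2.0" xmlns:prx="http://example.org/">
  <channel>
    <item>
      <title>xxx</title>
      <prx:proxy>
        <prx:ip>101.226.74.168</prx:ip>
        <prx:port>8080</prx:port>
      </prx:proxy>
      <prx:proxy>
        <prx:ip>10.0.0.1</prx:ip>
        <prx:port>3128</prx:port>
      </prx:proxy>
    </item>
  </channel>
</rss>
XML;

$feed = simplexml_load_string($rss);
// Map of prefix => namespace URI for every namespace declared in the document.
$ns = $feed->getNamespaces(true);

$proxies = [];
foreach ($feed->channel->item as $item) {
    // Select the item's children that live in the prx namespace.
    foreach ($item->children($ns['prx'])->proxy as $proxy) {
        // Ask for the prx children again to read the individual fields.
        $fields = $proxy->children($ns['prx']);
        $proxies[] = [
            'ip'   => (string) $fields->ip,
            'port' => (int) $fields->port,
        ];
    }
}

print_r($proxies);
```

Casting to string or int matters here, because each field is otherwise still a SimpleXMLElement object rather than a plain value.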

Upvotes: 1

Anthony Sterling

Reputation: 2441

Take a look at SimpleXMLElement::children(); it lets you access the namespaced elements.

For example:

<?php
$xml = '<xml xmlns:prx="http://example.org/">
<item>
    <title>xxx</title>
    <link>xxx</link>
    <guid>xxx</guid>
    <description>xxx</description>
    <prx:proxy>
        <prx:ip>101.226.74.168</prx:ip>
        <prx:port>8080</prx:port>
        <prx:type>Anonymous</prx:type>
        <prx:ssl>false</prx:ssl>
        <prx:check_timestamp>1369199066</prx:check_timestamp>
        <prx:country_code>CN</prx:country_code>
        <prx:latency>20585</prx:latency>
        <prx:reliability>9593</prx:reliability>
    </prx:proxy>
</item>
</xml>';

$sxe = new SimpleXMLElement($xml);
foreach($sxe->item as $item)
{
    $proxy = $item->children('prx', true)->proxy;
    echo $proxy->ip; // 101.226.74.168
}
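As a side note, children() can be given either the prefix (with true as the second argument) or the full namespace URI. A small sketch comparing the two, using the same placeholder URI http://example.org/ as above (the variable names are only for illustration):

```php
<?php
$xml = '<xml xmlns:prx="http://example.org/">
<item>
    <prx:proxy>
        <prx:ip>101.226.74.168</prx:ip>
    </prx:proxy>
</item>
</xml>';

$sxe  = new SimpleXMLElement($xml);
$item = $sxe->item;

// Second argument true: the first argument is treated as a prefix.
$byPrefix = (string) $item->children('prx', true)->proxy->ip;

// Default: the first argument is treated as the namespace URI.
$byUri = (string) $item->children('http://example.org/')->proxy->ip;

// Without children(), only non-namespaced children are visible.
$plainCount = count($item->proxy);

var_dump($byPrefix === $byUri, $plainCount);
```

The URI form is usually the more robust choice, since a feed is free to change the prefix it uses while keeping the same namespace URI.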

Anthony.

Upvotes: 2

JimL

Reputation: 2541

I would just strip out the "prx:" prefix before parsing:

$proxylist_rss = file_get_contents('http://www.xxx.com/xxx.xml');
$proxylist_rss = str_replace('prx:', '', $proxylist_rss);

$proxylist_xml = new SimpleXmlElement($proxylist_rss);

foreach($proxylist_xml->channel->item as $item) {
    foreach($item->proxy as $entry) {
        var_dump($entry);
    }
}

http://phpfiddle.org/main/code/jsz-vga
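One thing to keep in mind with this approach: str_replace() works on the raw string, so it also changes any "prx:" that happens to appear in text content. A minimal sketch of both effects, with a made-up feed and a placeholder namespace URI:

```php
<?php
// Hypothetical feed; the description text deliberately contains "prx:".
$xml = '<rss xmlns:prx="http://example.org/"><channel><item>'
     . '<description>see prx:ip below</description>'
     . '<prx:proxy><prx:ip>101.226.74.168</prx:ip></prx:proxy>'
     . '</item></channel></rss>';

// Strip the prefix from the raw XML before parsing.
$stripped = str_replace('prx:', '', $xml);

$feed = new SimpleXMLElement($stripped);
$item = $feed->channel->item;

// The former prx:proxy is now a plain child element.
$ip = (string) $item->proxy->ip;

// The text content was rewritten as well.
$description = (string) $item->description;

echo $ip, "\n";          // 101.226.74.168
echo $description, "\n"; // see ip below
```

If the feed can contain that string in its text, or an unprefixed element that is also called proxy, the children() approach is the safer one.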

Upvotes: 2
