J. LaRosee

Reputation: 992

Validating an RSS feed

I'm working on an application that allows users to add their own RSS feeds to a simple reader of sorts.

Currently, I'm using xml_domit_rss as the parser but I'm unsure if it's actually validating the URL before parsing.

From what I can gather online, validation appears to be a separate step from parsing, done either through a service such as https://www.feedvalidator.org or by some other method such as parse_url().

Anyone have some insight into either how xml_domit_rss validates, or a method by which I can validate before sending the URL to the parser?

Upvotes: 1

Views: 2819

Answers (5)

Huy - Logarit

Reputation: 686

Try this code:

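function validateFeed( $sFeedURL )
{
    // Ask the feedvalidator.org service to check the feed at $sFeedURL.
    $sValidator = 'http://feedvalidator.org/check.cgi?url=';
    $sValidationResponse = @file_get_contents($sValidator . urlencode($sFeedURL));

    // file_get_contents() returns false when the request fails.
    if( $sValidationResponse === false )
    {
        return false;
    }

    // Treat the feed as valid only if the result page contains the success phrase.
    return stristr( $sValidationResponse , 'This is a valid RSS feed' ) !== false;
}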
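Before calling a remote validator at all, it may be worth doing the cheap local check the question hints at with parse_url(). The following is only a minimal sketch: looksLikeFeedUrl() is a hypothetical helper name, and limiting the scheme to http/https is my assumption, not something the question requires. It only checks that the string is a plausible URL, not that a feed actually lives there.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: syntactic check of a user-supplied feed URL.
 * It makes no network request and says nothing about the feed itself.
 */
function looksLikeFeedUrl(string $url): bool
{
    // filter_var() returns false when the string is not a syntactically valid URL.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    // Assumption: only http and https feeds are acceptable.
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));

    return in_array($scheme, ['http', 'https'], true);
}
```

You would call this first and only pass the URL on to validateFeed() or the parser when it returns true.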

Upvotes: 0

davethegr8

Reputation: 11595

This is my quick and dirty solution, which worked for me under similar circumstances:

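foreach($sources as $source) {
    if(!$source["url"]) {
        continue;
    }

    // curl_request() is not a PHP built-in; assume a helper that returns the response body as a string
    $rss = curl_request($source["url"]);
    $rss = str_replace('&', '&amp;', $rss);

    $parser = xml_parser_create();
    // third argument true = this is the final chunk, so unclosed tags are reported as errors
    if(xml_parse($parser, $rss, true)) {
        $xmle = new SimpleXMLElement($rss);
    }
    else {
        $xmle = null;
        continue;
    }

    //other stuff here
}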

I make sure to replace the ampersands with &amp;, because not doing that can cause issues with the SimpleXMLElement parser and entities such as &bull; or &mdash;.

The xml_parse returns 1 on success, so you can check it with a straight if statement. Then using the SimpleXMLElement to traverse the RSS feed makes things nice and easy.
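The same "does it parse?" check can also be written with SimpleXML alone. This is a rough sketch rather than a drop-in replacement: detectFeedType() is a made-up name, and deciding the feed type from the root element name (rss, feed, RDF) is my own simplification.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns 'rss', 'atom' or 'rdf' when the string is
 * well-formed XML with a recognised feed root element, otherwise null.
 */
function detectFeedType(string $xml): ?string
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $root = simplexml_load_string($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($root === false) {
        return null; // not well-formed XML
    }

    // getName() gives the element name without any namespace prefix.
    switch ($root->getName()) {
        case 'rss':
            return 'rss';  // RSS 0.9x / 2.0
        case 'feed':
            return 'atom'; // Atom
        case 'RDF':
            return 'rdf';  // RSS 1.0, whose root is rdf:RDF
        default:
            return null;   // well-formed XML, but not a feed root I recognise
    }
}
```

In the loop above you could skip any source for which detectFeedType($rss) returns null.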

Upvotes: 0

Lukas Šalkauskas

Reputation: 14361

It's simple: in .NET you can do that using SyndicationFeed, which supports Atom 1.0 and RSS 2.0.

try
{
    SyndicationFeed fetchedItems = SyndicationFeed.Load(XmlReader.Create(feedUrl));
    // Validation successful.
}
catch
{
    // Validation failed.
}

Upvotes: 0

Grey Panther

Reputation: 13118

Validating in the context of XML files (and hence RSS/Atom feeds, which use XML to encode their values) means checking the file against a document schema that describes its expected structure (which elements can have which child elements, which attributes can be present, etc).

Some XML parsers require a schema and bork (this is a technical term :-) - refuse to parse) on XML files that do not conform to it. Since you are parsing arbitrary RSS, it is probably best to skip validation and make a best-effort attempt at parsing the feed. You could also show the parse results to the user (similar to how Google Reader does it when you add a new feed) and let her judge whether the result looks OK.
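To illustrate the "parse leniently, then show a preview" idea in PHP, here is a small sketch using DOMDocument instead of the parser from the question. previewFeedTitles() is a hypothetical name, and how much libxml manages to recover from a broken feed depends on the damage, so treat the result as a hint for the user rather than a verdict.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: best-effort parse, returning up to $limit entry
 * titles so the user can judge whether the feed looks right.
 */
function previewFeedTitles(string $xml, int $limit = 5): array
{
    if ($xml === '') {
        return []; // loadXML() rejects an empty string
    }

    $doc = new DOMDocument();
    $doc->recover = true; // ask libxml to keep going after recoverable errors

    $previous = libxml_use_internal_errors(true);
    $loaded = $doc->loadXML($xml, LIBXML_NOERROR | LIBXML_NOWARNING);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return [];
    }

    $titles = [];
    // RSS wraps entries in <item>, Atom in <entry>.
    foreach (['item', 'entry'] as $tag) {
        foreach ($doc->getElementsByTagName($tag) as $node) {
            $title = $node->getElementsByTagName('title')->item(0);
            if ($title !== null) {
                $titles[] = trim($title->textContent);
            }
            if (count($titles) >= $limit) {
                return $titles;
            }
        }
    }

    return $titles;
}
```

An empty array means nothing usable came out of the parse, which is a reasonable point to tell the user the feed could not be read.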

Unfortunately, the XML parser used by this code seems to be dead, and I can't find any details on how strict or lax its parsing is...

Upvotes: 0

Johannes Weiss

Reputation: 54031

You could validate the RSS with a RelaxNG schema. Schemas for all the different feed formats should be available online...
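In PHP, one way to try this is DOMDocument::relaxNGValidate(). A minimal sketch, assuming you have already saved a schema for the feed format locally: matchesRelaxNg() and the $schemaPath parameter are names I made up, and as far as I know libxml expects the XML syntax (.rng) rather than the compact syntax (.rnc).

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: true when $xml is well-formed and valid against
 * the Relax NG schema (XML syntax) stored at $schemaPath.
 */
function matchesRelaxNg(string $xml, string $schemaPath): bool
{
    if ($xml === '') {
        return false; // loadXML() rejects an empty string
    }

    // Keep parse and validation errors out of the PHP warning stream.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $ok = $doc->loadXML($xml) && $doc->relaxNGValidate($schemaPath);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok;
}
```

Keep in mind that many real-world feeds are not strictly valid, so a false result here does not necessarily mean the feed is unreadable.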

Upvotes: 1
