Rob

Reputation: 6370

RSS feed not validating because of substr cutting html characters

I'm currently unable to get my RSS feed to validate with the W3C RSS Validator. There seems to be a problem with the time/date; if you click the W3C link, it shows the errors. When I comment out the date it works fine, but the date is kind of crucial!

Here's the original script:

  <?php
  include "db.php";

  header("Expires: 0");
  header("Last-Modified: " . gmdate("D, d M Y H:i:s") . " GMT");
  header("cache-control: no-store, no-cache, must-revalidate");
  header("Pragma: no-cache");
  header("Content-type: text/xml");
  print "<?xml version=\"1.0\" encoding=\"utf-8\" ?>";

?>
<rss version="2.0">
  <channel>
    <title>MediWales Events</title>
    <description>The latest Events, updates and announcements from MediWales.</description>
    <link>http://www.mediwales.com</link>
    <copyright>Copyright 2011 MediWales.</copyright>
    <docs>http://blogs.law.harvard.edu/tech/rss</docs>
    <language>en-us</language>
    <lastBuildDate><? print date("D, d M Y H:i:s"); ?> +0000</lastBuildDate>
    <managingEditor>[email protected]</managingEditor>
    <pubDate><? print date("D, d M Y H:i:s"); ?> +0000</pubDate>

    <webMaster>[email protected]</webMaster>
    <generator>codeworks rss script (1.0.0)</generator>
    <image>
      <url>http://mediwales.com/login/uploaded/template/logo.png</url>
      <title>MediWales Website</title>
      <link>http://www.mediwales.com</link>
      <description>The latest Events, updates and announcements from MediWales.</description>
      <width>144</width>
      <height>52</height>
    </image>


  <?

      $latestnews = mysql_query("SELECT myevents.*, myevents_dates.datefrom from myevents, myevents_dates WHERE myevents_dates.datefrom >= CURDATE() AND myevents.id = myevents_dates.eventid order by myevents_dates.datefrom");
          while ($news = mysql_fetch_assoc($latestnews)) {

              $datetime = explode(" ",$news[datefrom]);
              $date = explode("-",$datetime[0]);
              $time = explode(":",$datetime[1]);
              $news[description] = strip_tags($news[description]);
              $news[description] = htmlspecialchars($news[description]);

              echo "<item>";
              echo "<title>".mb_convert_encoding(htmlspecialchars($news[title]),"US-ASCII")."</title>";
              echo "<description>".mb_convert_encoding(substr($news[description],0, 250),"US-ASCII")."</description>";
              echo "<link>http://www.mediwales.com/index.php?id=4&amp;nid=$news[id]</link>";
              echo "<pubDate>".date('D, d M Y H:i:s O', mktime($time[0],$time[1],$time[2],$date[1],$date[2],$date[0]))."</pubDate>";
              echo "</item>";


          }

  ?>
  </channel>  
</rss>

Upvotes: 0

Views: 341

Answers (4)

Rob

Reputation: 6370

I've temporarily solved the problem by removing some html characters on my actual website so the feed isn't grabbing them.

I know the problem may come back when we grab the next set of feeds, but I'm too rushed to fix it at the moment.

Upvotes: 0

joseignaciorc

Reputation: 334

Notice that the only error is on line 56:

nbsp;&</description>

should be:

nbsp;&amp;</description>

The problem is that you are calling htmlspecialchars first and substr afterwards, so the last &amp; gets truncated to a bare &, which makes your feed invalid. To fix this, call substr first and htmlspecialchars last.
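Here is a minimal sketch of that ordering. The helper name feed_description and its $limit parameter are made up for illustration and are not part of your script. It also swaps substr for mb_substr, on the assumption that your text is UTF-8, so that a multibyte character is not split either.

```php
<?php
// Hypothetical helper (not in the original script): builds the text for an
// RSS <description> by stripping tags, truncating, and escaping LAST.
function feed_description($html, $limit = 250)
{
    // 1. Remove markup so only plain text is left.
    $text = strip_tags($html);

    // 2. Truncate the unescaped text. mb_substr counts characters, not bytes,
    //    so a UTF-8 character is never cut in half (assumes UTF-8 input).
    $text = mb_substr($text, 0, $limit, 'UTF-8');

    // 3. Escape last, so every "&" in the output is a complete "&amp;".
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// The cut lands right after the "&", yet the output is still well-formed.
echo feed_description('<p>Fish & chips</p>', 6), "\n"; // prints: Fish &amp;
```

In the loop you would then echo feed_description($news['description']) inside the description element, instead of escaping first and truncating afterwards. The escaped string can end up slightly longer than the limit, which is fine for a feed.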

The other messages ("Email address is missing real name", "item should contain a guid element") are just recommendations: you should follow them because they are good ideas, but they will not cause the feed to fail validation.
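For the guid recommendation, a rough sketch could look like the following. The function name feed_item is hypothetical, and reusing the item's link as a permalink guid is only one possible choice, assuming each event URL is stable and unique.

```php
<?php
// Hypothetical helper: renders one <item> whose <guid> reuses the permalink.
function feed_item($id, $title)
{
    // Build the raw URL first, then escape it once for use inside XML.
    $link = 'http://www.mediwales.com/index.php?id=4&nid=' . (int) $id;
    $escapedLink = htmlspecialchars($link, ENT_QUOTES, 'UTF-8');

    return '<item>'
        . '<title>' . htmlspecialchars($title, ENT_QUOTES, 'UTF-8') . '</title>'
        . '<link>' . $escapedLink . '</link>'
        . '<guid isPermaLink="true">' . $escapedLink . '</guid>'
        . '</item>';
}

echo feed_item(7, 'Annual Conference'), "\n";
```

For the email warning, RSS 2.0 expects managingEditor and webMaster in the form "address (Real Name)", so adding the name in parentheses after the address should be enough.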

Upvotes: 2

Dave Winer

Reputation: 1917

I think they're complaining about the fact that you're using a date that's in the future.

If so, that is not, imho, a reason to declare your feed invalid. Real-world publications often have publication dates in the future.

The spec, which is the actual authority on this, doesn't say there's anything wrong with pubDates in the future.

http://cyber.law.harvard.edu/rss/rss.html

Validators can have bugs too. :-)

Upvotes: 0

Joe White

Reputation: 97656

There are a number of other errors you'll need to fix (like cutting off in the middle of an HTML entity). But they provide a Help link for each one.

In specific reference to the date error, if you follow their Help link, you'll see that one of the possible reasons for this warning is that a date is in the future. The date they're complaining about is "Implausible date: Mon, 07 Mar 2011 00:00:00 +0000". Today is 1 Mar 2011, so 7 Mar 2011 is indeed in the future.

If you continue reading their Help link, they explain why this is a problem. The fix is not to include future dates in your feed.
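If you want to keep upcoming events in the feed, one possible way to avoid future dates is to cap each item's pubDate at the current time. This is only a sketch under that assumption: the function name feed_pub_date and its $now parameter are invented for illustration, and you may prefer a different rule (for example, using the date the event was added).

```php
<?php
// Hypothetical helper: formats an RFC 822 date for <pubDate>, never later
// than "now". $now is injectable so the behaviour can be checked.
function feed_pub_date($eventTimestamp, $now = null)
{
    if ($now === null) {
        $now = time();
    }
    // Use whichever is earlier: the event time or the current time.
    $timestamp = min($eventTimestamp, $now);

    // gmdate() formats in UTC, so the fixed +0000 offset is correct.
    return gmdate('D, d M Y H:i:s', $timestamp) . ' +0000';
}

// An event on 7 Mar 2011, published on 1 Mar 2011, is capped at 1 Mar.
$event = gmmktime(0, 0, 0, 3, 7, 2011);
$today = gmmktime(0, 0, 0, 3, 1, 2011);
echo feed_pub_date($event, $today), "\n"; // prints: Tue, 01 Mar 2011 00:00:00 +0000
```

The actual event date can still go into the title or description so that readers see it.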

Upvotes: 0
