Parse RSS feed using XML package in R

Question

I am trying to scrape and parse the following RSS feed: http://www.huffingtonpost.com/rss/liveblog/liveblog-1213.xml. I have looked at other questions about R and XML but have been unable to make any progress on my problem. The XML code for each entry is:

        
     <![CDATA[Five Rockets Intercepted By Iron Drone Systems Over Be'er Sheva]]>
     http://www.huffingtonpost.co.uk/2012/11/15/tel-aviv-gaza-rocket_n_2138159.html#2_five-rockets-intercepted-by-iron-drone-systems-over-beer-sheva
     Haaretz reports that five more rockets intercepted by Iron Dome systems over Be'er Sheva. In total, there have been 274 rockets fired and 105 intercepted. The IDF has attacked 250 targets in Gaza.]]>
     http://www.huffingtonpost.co.uk/2012/11/15/tel-aviv-gaza-rocket_n_2138159.html#2_five-rockets-intercepted-by-iron-drone-systems-over-beer-sheva
     2012-11-15T12:56:09-05:00
     Huffingtonpost.com

For each entry/post I want to record "Date" (pubDate), "Title" (title), and "Description" (full text, cleaned). I have tried to use the XML package in R, but confess I am a bit of a newbie (little to no experience working with XML, but some R experience). The code I am working from, and getting nowhere with, is:

    library(XML)

    xml.url <- "http://www.huffingtonpost.com/rss/liveblog/liveblog-1213.xml"

    # Use the xmlTreeParse function to parse the XML file directly from the web
    xmlfile <- xmlTreeParse(xml.url)

    # Use the xmlRoot function to access the top node
    xmltop <- xmlRoot(xmlfile)

    # Name of the root element
    xmlName(xmltop)

    # Names of the children of the first child of the root
    names(xmltop[[1]])

  title          link   description      language     copyright 
  "title"        "link" "description"    "language"   "copyright" 
 category     generator          docs          item          item 
  "category"   "generator"        "docs"        "item"        "item"

However, whenever I try to manipulate the "title" or "description" information, I continually get errors. Any help troubleshooting this code would be most appreciated.

Thanks, Thomas

Answers (1)
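
One possible approach, offered as a minimal sketch rather than a definitive fix: parse the feed into an internal document with xmlParse() and pull the fields out of every item with XPath. The error messages are not shown in the question, so this sidesteps the R-level tree instead of diagnosing it. The inline feed below is a made-up stand-in so the example runs offline; it assumes each item holds title, description and pubDate children, as the question describes. The helpers item_field() and strip_html() are hypothetical names introduced here, not part of the XML package.

```r
library(XML)

# Made-up stand-in for the real feed so the sketch runs without network access.
# With the live feed, the document would come from xml.url instead.
rss_text <- '<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example live blog</title>
    <item>
      <title><![CDATA[First post]]></title>
      <description><![CDATA[<p>Some <b>HTML</b> text.</p>]]></description>
      <pubDate>2012-11-15T12:56:09-05:00</pubDate>
    </item>
    <item>
      <title><![CDATA[Second post]]></title>
      <description><![CDATA[Plain text.]]></description>
      <pubDate>2012-11-15T13:10:00-05:00</pubDate>
    </item>
  </channel>
</rss>'

# xmlParse() returns an internal (C-level) document, which the XPath
# functions require; the default xmlTreeParse() result is an R-level tree.
doc <- xmlParse(rss_text, asText = TRUE)

# Hypothetical helper: text of one child element from every <item>.
item_field <- function(doc, field) {
  xpathSApply(doc, paste0("//item/", field), xmlValue)
}

# Hypothetical helper: crude tag stripper for the description text.
# Messy real-world HTML may need a proper HTML parser instead.
strip_html <- function(x) trimws(gsub("<[^>]+>", "", x))

posts <- data.frame(
  Date        = item_field(doc, "pubDate"),
  Title       = item_field(doc, "title"),
  Description = strip_html(item_field(doc, "description")),
  stringsAsFactors = FALSE
)

print(posts)
```

This assumes every item contains all three elements. If some items lack one, the three vectors will differ in length and data.frame() will fail, in which case looping over the item nodes one at a time would be safer.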
