Web scraping in R using xpathSApply

Question

I've read the previous questions about web scraping in R but couldn't solve my problem. I want to get the names of the pictures (see the URL below) and detailed information about each picture. I realize that I have to use xpathSApply and a loop to collect the information for every picture, but right now I can't even get the name of a single picture from http://www.wikiart.org/en/search/monet/11

    library(XML)
    url = "http://www.wikiart.org/en/search/monet/1#supersized-search-211804"
    doc = htmlTreeParse(url, useInternalNodes=T)
    pictureName = xpathSApply(doc,"//a[contains(@href, 'title')]",xmlValue)
    pictureName
    ## list()

Why does it give me list()?

G. Grothendieck · Accepted Answer

Try this:

    pictureNames <- xpathSApply(doc,"//a[@class='big rimage']/@title", unname)

giving:

    > head(pictureNames)
    [1] "Camille and Jean Monet in the Garden at Argenteuil - Claude Monet"
    [2] "Camille Monet at the Window, Argentuile - Claude Monet"
    [3] "Camille Monet in the Garden - Claude Monet"
    [4] "Camille Monet in the Garden at the House in Argenteuil - Claude Monet"
    [5] "Camille Monet on a Garden Bench - Claude Monet"
    [6] "Camille Monet On Her Deathbed - Claude Monet"
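
As for why the original call returned list(): the XPath //a[contains(@href, 'title')] selects links whose href contains the string 'title', and since the result was empty, no link on the page matched. The picture names are stored in the title attribute of the links instead. Below is a minimal sketch on a made-up HTML fragment. The markup, hrefs, and painting names are assumptions that only imitate the page; the `big rimage` class is taken from the XPath above.

```r
library(XML)

# Hypothetical fragment imitating the search page's markup
# (an assumption for illustration, not copied from wikiart.org).
html <- '<html><body>
  <a class="big rimage" href="/en/claude-monet/painting-1" title="Painting One - Claude Monet"><img src="1.jpg"/></a>
  <a class="big rimage" href="/en/claude-monet/painting-2" title="Painting Two - Claude Monet"><img src="2.jpg"/></a>
</body></html>'
doc <- htmlParse(html, asText = TRUE)

# No href contains the string "title", so the node set is empty.
empty <- xpathSApply(doc, "//a[contains(@href, 'title')]", xmlValue)
length(empty)

# Select the title attribute itself; unname drops the "title" name
# attached to each attribute value.
titles <- xpathSApply(doc, "//a[@class='big rimage']/@title", unname)

# Alternative: select the <a> nodes and read the attribute from each one.
titles2 <- xpathSApply(doc, "//a[@class='big rimage']", xmlGetAttr, "title")
```

The second form may be easier to extend when several attributes are needed from the same node, for example the href of each picture's detail page.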
