Reputation: 73
I've read all the previous questions about web scraping in R but couldn't solve my problem. I want to get the names of the pictures (see the URL below) and detailed information about every picture.
I realize that I have to use xpathSApply and a loop to get the information about every picture. But right now I can't even get the name of a single picture from http://www.wikiart.org/en/search/monet/11
library(XML)
url = "http://www.wikiart.org/en/search/monet/1#supersized-search-211804"
doc = htmlTreeParse(url, useInternalNodes=TRUE)
pictureName = xpathSApply(doc,"//a[contains(@href, 'title')]",xmlValue)
pictureName
## list()
Why does it give me list()?
Upvotes: 0
Views: 750
Reputation: 269396
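The picture names are stored in the title attribute of the links, not in their href, so the XPath in the question matches nothing. Try this: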
pictureNames <- xpathSApply(doc,"//a[@class='big rimage']/@title", unname)
giving:
> head(pictureNames)
[1] "Camille and Jean Monet in the Garden at Argenteuil - Claude Monet"
[2] "Camille Monet at the Window, Argentuile - Claude Monet"
[3] "Camille Monet in the Garden - Claude Monet"
[4] "Camille Monet in the Garden at the House in Argenteuil - Claude Monet"
[5] "Camille Monet on a Garden Bench - Claude Monet"
[6] "Camille Monet On Her Deathbed - Claude Monet"
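To see the difference between the two queries without depending on the live site, here is a minimal sketch that runs both against an inline HTML snippet. The snippet, its hrefs and its titles are made up for illustration; it only assumes the page uses links with class 'big rimage' and a title attribute, as the query above implies.

```r
library(XML)

# Hypothetical markup standing in for the search page; the real page may differ.
html <- '<html><body>
  <a class="big rimage" href="/en/claude-monet/a" title="Painting A - Claude Monet">x</a>
  <a class="big rimage" href="/en/claude-monet/b" title="Painting B - Claude Monet">y</a>
  <a class="small" href="/en/other" title="Not a painting">z</a>
</body></html>'

doc <- htmlParse(html, asText = TRUE)

# Query from the question: keeps only links whose href contains "title".
# No href in this snippet does, so nothing is matched.
noMatch <- xpathSApply(doc, "//a[contains(@href, 'title')]", xmlValue)

# Query from the answer: selects the title attribute of the picture links.
pictureNames <- xpathSApply(doc, "//a[@class='big rimage']/@title", unname)
pictureNames
```

The first query filters on the value of href, while the second one navigates to the title attribute itself, which is where the names live in this snippet.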
Upvotes: 2