Reputation: 20342
I am trying to figure out how to loop through a few URLs. This is just a learning exercise for myself. I thought I basically knew how to do this, but I have become stuck. I believe my code below is close, but for some reason it isn't incrementing or scraping anything.
library(rvest)
URL <- "https://www.ebay.com/sch/i.html?_from=R40&_sacat=0&_nkw=mens%27s+shoes+size+11&rt=nc"
WS <- read_html(URL)
URLs <- WS %>% html_nodes("#ResultSetItems") %>% html_attr("href") %>% as.character()
Basically, I went to eBay, entered a simple search term, found a key node named 'ResultSetItems', and tried to scrape the items from it. Nothing happened. I'm also trying to figure out how to increment through, say, 5 URLs and apply the same logic. The URLs would look like this:
'https://www.ebay.com/sch/i.html?_from=R40&_sacat=0&_nkw=mens%27s+shoes+size+11&_pgn=1&_skc=0&rt=nc'
'https://www.ebay.com/sch/i.html?_from=R40&_sacat=0&_nkw=mens%27s+shoes+size+11&_pgn=2&_skc=0&rt=nc'
'https://www.ebay.com/sch/i.html?_from=R40&_sacat=0&_nkw=mens%27s+shoes+size+11&_pgn=3&_skc=0&rt=nc'
'https://www.ebay.com/sch/i.html?_from=R40&_sacat=0&_nkw=mens%27s+shoes+size+11&_pgn=4&_skc=0&rt=nc'
'https://www.ebay.com/sch/i.html?_from=R40&_sacat=0&_nkw=mens%27s+shoes+size+11&_pgn=5&_skc=0&rt=nc'
I think the code should look something like this:
# Build the five page URLs in one vectorised paste0() call
sites <- paste0("https://www.ebay.com/sch/i.html?_from=R40&_sacat=0&_nkw=mens%27s+shoes+size+11&_pgn=", 1:5, "&_skc=0&rt=nc")
# Read each page and pull the href attributes from the results node
dfList <- lapply(sites, function(site) {
  WS <- read_html(site)
  WS %>% html_nodes("#ResultSetItems") %>% html_attr("href") %>% as.character()
})
finaldf <- do.call(rbind, dfList)
I can’t seem to get this working. I may be over-simplifying things.
Upvotes: 0
Views: 106
Reputation: 1362
Here is how to do it. Given a set of URLs (read_url in my case), you only need to apply the scraping steps with purrr's map function.
library(rvest)
library(purrr)  # map() comes from purrr, not rvest

read_url %>%
  map(read_html) %>%
  map(html_nodes, css = "xxxx") %>%
  map(html_nodes, xpath = "xxx") %>%
  map(html_attr, name = "xxx") %>%
  unlist()
You will get a list of objects to which you can apply the same functions to get the data you want. Once you have done that, you just have to combine the list into a data frame.
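For instance, a minimal sketch of that last step (the names pages and result are my own; I assume each list element is a character vector extracted from one page):
library(tibble)

# Dummy stand-in for the per-page extraction results
pages <- list(c("/item/1", "/item/2"), c("/item/3"))

# Flatten the list and wrap it in a one-column data frame
result <- tibble(link = unlist(pages))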
But looking at http://www.ebay.com/robots.txt, it seems that scraping is not allowed on this part of eBay. Maybe you should choose another example to practice on. ;) HTH!
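If you want to check that programmatically, one option is the robotstxt package (an extra dependency, not part of rvest; a minimal sketch):
library(robotstxt)

# FALSE means the path is disallowed for the default user agent
paths_allowed(paths = "/sch/i.html", domain = "ebay.com")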
Edited
Your eBay example cannot return results because scraping it is prohibited. To make this clearer, I will use the example of a web page that allows web scraping, books.toscrape.com. This is how I do it to avoid using functions of the apply family. First, we generate the list of pages from which to obtain the information.
library(rvest)
library(tidyverse)

# Build the URLs for the first five catalogue pages
urls <- "http://books.toscrape.com/catalogue/page-"
pag <- 1:5
read_urls <- paste0(urls, pag, ".html")

# Read each page once and keep the parsed documents in a list
read_urls %>%
  map(read_html) -> p
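As a side note, if you scrape more pages it can be polite to pause between requests. A small variation of the step above (the one-second delay is my own choice, not a requirement of the site):
read_urls %>%
  map(function(u) {
    Sys.sleep(1)  # wait a second between requests
    read_html(u)
  }) -> p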
Once the pages are read, simply extract the information you want: use html_nodes to drill down through the document (with CSS selectors or XPath where necessary), then read an attribute with html_attr (as in the title case) or the text with html_text (as in the price case). Finally, convert the result to a tibble:
# Extract titles from the pages
p %>%
  map(html_nodes, "article") %>%
  map(html_nodes, xpath = "./h3/a") %>%
  map(html_attr, "title") %>%
  unlist() -> titles

# Extract prices from the pages
p %>%
  map(html_nodes, "article") %>%
  map(html_nodes, ".price_color") %>%
  map(html_text) %>%
  unlist() -> prices

r <- tibble(titles, prices)
As a result:
# A tibble: 100 x 2
titles prices
<chr> <chr>
1 A Light in the Attic £51.77
2 Tipping the Velvet £53.74
3 Soumission £50.10
4 Sharp Objects £47.82
5 Sapiens: A Brief History of Humankind £54.23
6 The Requiem Red £22.65
Now it is possible to turn all of this into a function, but I leave that in your hands.
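A minimal sketch of what such a function could look like (the name scrape_books and its n_pages argument are my own, assuming the same books.toscrape.com structure as above):
library(rvest)
library(tidyverse)

# Scrape titles and prices from the first n_pages catalogue pages
scrape_books <- function(n_pages = 5) {
  read_urls <- paste0("http://books.toscrape.com/catalogue/page-",
                      seq_len(n_pages), ".html")
  p <- map(read_urls, read_html)
  titles <- p %>%
    map(html_nodes, "article") %>%
    map(html_nodes, xpath = "./h3/a") %>%
    map(html_attr, "title") %>%
    unlist()
  prices <- p %>%
    map(html_nodes, "article") %>%
    map(html_nodes, ".price_color") %>%
    map(html_text) %>%
    unlist()
  tibble(titles, prices)
}

# Usage: scrape_books(2) returns one row per book on the first two pages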
Upvotes: 1