Reputation: 25
How can I store and retrieve the results of a for loop when scraping multiple web pages in R?
library(rvest)
library(dplyr)
library(tidyverse)
library(glue)
cont<-rep(NA,101)
countries <- c("au","at","de","se","gb","us")
for (i in countries) {
sides<-glue("https://www.beeradvocate.com/beer/top-rated/",i,.sep = "")
html <- read_html(sides)
cont[i] <- html %>%
html_nodes("table") %>% html_table()
}
table_au <- cont[2] [[1]]
The idea is to get one table per website. When I run my code, table_au just shows NA, presumably because the loop results are not stored.
It would be awesome if someone could help me.
BR,
Marco
Upvotes: 1
Reputation: 389175
We can extract the first table from each page and collect them in a list.
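library(rvest)
# Country codes from the question
countries <- c("au", "at", "de", "se", "gb", "us")
url <- "https://www.beeradvocate.com/beer/top-rated/"
# For each country page, read the HTML and keep its first table
temp <- purrr::map(paste0(url, countries), ~{
  .x %>%
    read_html() %>%
    html_nodes("table") %>%
    html_table(header = TRUE) %>%
    .[[1]]
})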
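As for why the original loop shows NA: indexing an unnamed vector with a character value such as "au" appends a new named element instead of filling position 1, 2, and so on, so cont[2] still holds the initial NA. The sketch below illustrates this and shows one possible loop-based fix. It runs offline: fake_scrape() is a hypothetical stand-in for the real rvest calls, and the vector is shortened to 3 elements for brevity.

```r
# Hypothetical stand-in for the scraper so the sketch runs offline;
# in the real loop this would be read_html() %>% html_nodes("table") %>% html_table().
fake_scrape <- function(country) {
  list(data.frame(country = country, rank = 1:2))
}

# What the question's loop does: a character index on an unnamed vector
# appends a new named element instead of filling an existing position.
cont <- rep(NA, 3)
cont["au"] <- fake_scrape("au")
length(cont)      # one element longer than before
is.na(cont[[2]])  # position 2 still holds the original NA

# One possible fix: preallocate a named list and assign with [[ ]].
countries <- c("au", "at")
results <- vector("list", length(countries))
names(results) <- countries
for (i in countries) {
  results[[i]] <- fake_scrape(i)[[1]]  # keep the first table of each page
}
table_au <- results[["au"]]
```

With a named list, each table is retrieved by its country code (results[["au"]]) rather than by a numeric position.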
If you want data as different dataframes like tab_au, tab_at, we can name the list and use list2env to get data separately.
names(temp) <- paste0('tab_', countries)
list2env(temp, .GlobalEnv)
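To see what list2env does without touching the network or the global environment, here is a small sketch. The two data frames and the demo_env environment are made up for illustration; only the naming pattern comes from the code above.

```r
# Hypothetical stand-in for the scraped list of tables.
temp <- list(data.frame(beer = "A"), data.frame(beer = "B"))
countries <- c("au", "at")
names(temp) <- paste0("tab_", countries)

# Copy each list element into an environment under its list name.
# A throwaway environment is used here instead of .GlobalEnv.
demo_env <- new.env()
list2env(temp, envir = demo_env)
ls(demo_env)  # lists the objects created from the list names

# Alternative: skip list2env and pull a table straight from the named list.
tab_au <- temp[["tab_au"]]
```

Keeping the tables in the named list is often easier to work with afterwards, for example when looping over all countries, but that is a matter of preference.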
Upvotes: 1