MB17

Reputation: 25

How to save the results from a for-loop when getting html websites in R?

I was wondering how to store and retrieve the data from a for loop when aiming to scrape multiple websites in R.

library(rvest)
library(dplyr)
library(tidyverse)
library(glue)

cont<-rep(NA,101)

countries <- c("au","at","de","se","gb","us")

for (i in countries) {
  sides <- glue("https://www.beeradvocate.com/beer/top-rated/", i, .sep = "")
  html <- read_html(sides)
  cont[i] <- html %>%
    html_nodes("table") %>%
    html_table()
}

table_au <- cont[2][[1]]

The idea is to get a table for each website. When I run my code, table_au just shows NA, presumably because the loop results are not being stored.
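For context, the storage problem above comes from indexing a numeric vector (`cont <- rep(NA, 101)`) with a character value (`i` is a country code, not a number). A minimal sketch of the fix, keeping the original for-loop but storing results in a named list instead; the scraping call is replaced with a placeholder string so the storage pattern itself is visible without a network request:

```r
countries <- c("au", "at", "de", "se", "gb", "us")

# Pre-allocate a list (not a numeric vector) and name it by country code,
# so character indexing with cont[[i]] works as intended.
cont <- vector("list", length(countries))
names(cont) <- countries

for (i in countries) {
  # Placeholder for read_html(...) %>% html_nodes("table") %>% html_table();
  # here we just record the URL that would be fetched.
  cont[[i]] <- paste0("https://www.beeradvocate.com/beer/top-rated/", i)
}

cont[["au"]]  # access a single country's result by code
```

With the real scraping call in place of the placeholder, `cont[["au"]]` would hold the parsed table(s) for the Australian page.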

It would be awesome if someone could help me.

BR,

Marco

Upvotes: 1

Views: 43

Answers (1)

Ronak Shah

Reputation: 389175

We can extract all the tables in a list.

library(rvest)

countries <- c("au", "at", "de", "se", "gb", "us")
url <- "https://www.beeradvocate.com/beer/top-rated/"

temp <- purrr::map(paste0(url, countries), ~ {
  .x %>%
    read_html() %>%
    html_nodes("table") %>%
    html_table(header = TRUE) %>%
    .[[1]]
})

If you want the data as separate data frames (tab_au, tab_at, ...), we can name the list and use list2env to assign each element as its own object in the global environment.

names(temp) <- paste0('tab_', countries)
list2env(temp, .GlobalEnv)
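A toy illustration of what list2env does here, using two small hand-made data frames in place of the scraped tables (the names and values are made up for demonstration):

```r
# A named list standing in for the scraped results.
temp <- list(
  tab_au = data.frame(rank = 1:2, beer = c("a", "b")),
  tab_at = data.frame(rank = 1:2, beer = c("c", "d"))
)

# list2env creates one object per list element, named after the element.
list2env(temp, .GlobalEnv)

# tab_au and tab_at now exist as standalone data frames.
nrow(tab_au)
```

Each element of the list becomes a variable whose name is the element's name, so after this call `tab_au` and `tab_at` can be used directly.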

Upvotes: 1
