Vitalij Ramich

Reputation: 31

Web scraping in R (with a loop)

I need to scrape data from this link and save the table as a CSV file. So far, I can scrape the first and second pages with rvest and save their tables using this code:

library(rvest)
webpage <- read_html("https://bra.areacodebase.com/number_type/M?page=0")
data <- webpage %>%
  html_nodes("table") %>%
  .[[1]] %>% 
  html_table()
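url <- "https://bra.areacodebase.com/number_type/M?page=0"
# Follow the "next" link in the pager to reach the second page
webpage2 <- html_session(url) %>% follow_link(css = ".pager-next a")
# Parse the second page (webpage2), not the first page again
data2 <- webpage2 %>%
  html_nodes("table") %>%
  .[[1]] %>%
  html_table()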
data_all <- rbind(data, data2)
write.table(data_all, "df_data.csv", sep = ";", na = "", quote = FALSE, row.names = FALSE)

#result<- lapply(webpage, %>% follow_link(css = ".pager-next a"))
#data_all <- rbind(data:data2)

However, I can't figure out how to run this in a loop over all the pages.

Upvotes: 3

Views: 4528

Answers (1)

m0nhawk

Reputation: 24168

You can either go to the next link with follow_link or get the page via URL directly:

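library(rvest)

webpage <- "https://bra.areacodebase.com/number_type/M?page=0"
pages <- list()  # one data frame per page, so earlier pages are not overwritten

# Pages are numbered 0 to 5089
for(i in 0:5089) {
  pages[[i + 1]] <- read_html(webpage) %>%
    html_nodes("table") %>%
    .[[1]] %>%
    html_table()

  # The last page has no "next" link to follow
  if (i < 5089) {
    webpage <- html_session(webpage) %>% follow_link(css = ".pager-next a") %>% .[["url"]]
  }
}

data_all <- do.call(rbind, pages)  # combine all pages into one data frame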
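If you would rather not hard-code the number of pages, one option is to keep following the pager until a page has no "next" link. The following is only a sketch under a few assumptions: `parse_page` and `scrape_all` are hypothetical helper names, the `.pager-next a` selector is the one from the question, and the "next" href is resolved against the current URL with `xml2::url_absolute`. It also reads each page once instead of twice.

```r
# Hypothetical helper: read one page and return its first table plus the URL
# of the next page (NA when the pager has no "next" link).
parse_page <- function(url) {
  page <- rvest::read_html(url)
  href <- rvest::html_attr(rvest::html_nodes(page, ".pager-next a"), "href")
  list(
    table = rvest::html_table(rvest::html_nodes(page, "table")[[1]]),
    next_url = if (length(href) == 0) NA_character_ else xml2::url_absolute(href[1], url)
  )
}

# Hypothetical helper: start at `start_url` and keep going until there is no
# next page or `max_pages` pages have been read. `parse` is a parameter so the
# loop can be tried on fake pages without touching the network.
scrape_all <- function(start_url, parse = parse_page, max_pages = Inf) {
  tables <- list()
  url <- start_url
  while (!is.na(url) && length(tables) < max_pages) {
    res <- parse(url)
    tables[[length(tables) + 1]] <- res$table
    url <- res$next_url
  }
  do.call(rbind, tables)  # stack the per-page tables into one data frame
}

# Usage (not run here; a small max_pages is a cheap way to try it first):
# data_all <- scrape_all("https://bra.areacodebase.com/number_type/M?page=0", max_pages = 3)
```

Collecting the tables in a list and calling `rbind` once at the end avoids growing a data frame inside the loop, which gets slow over thousands of pages.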

Or, by requesting each page's URL directly:

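library(rvest)

pages <- list()  # one data frame per page, so earlier pages are not overwritten

for(i in 0:5089) {
  webpage <- read_html(paste0("https://bra.areacodebase.com/number_type/M?page=", i))
  pages[[i + 1]] <- webpage %>%
    html_nodes("table") %>%
    .[[1]] %>%
    html_table()
}

data_all <- do.call(rbind, pages)  # combine all pages into one data frame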
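With several thousand requests, a single failed page would stop the whole loop, so it may be worth wrapping each request in `tryCatch` and pausing between requests. Below is a rough sketch along those lines, using `lapply` as attempted in the question. `fetch_page` and `scrape_pages` are hypothetical names, and the one-second pause is an arbitrary choice, not something the site documents.

```r
# Hypothetical helper: fetch the first table of page `i`.
# Returns NULL (with a message) if the request or the parsing fails.
fetch_page <- function(i, base_url = "https://bra.areacodebase.com/number_type/M?page=") {
  tryCatch({
    page <- rvest::read_html(paste0(base_url, i))
    rvest::html_table(rvest::html_nodes(page, "table")[[1]])
  }, error = function(e) {
    message("page ", i, " failed: ", conditionMessage(e))
    NULL
  })
}

# Hypothetical helper: fetch every page in `pages`, waiting `pause` seconds
# before each request, then stack whatever succeeded into one data frame.
scrape_pages <- function(pages, fetch = fetch_page, pause = 1) {
  tables <- lapply(pages, function(i) {
    Sys.sleep(pause)
    fetch(i)
  })
  tables <- Filter(Negate(is.null), tables)  # drop pages that failed
  do.call(rbind, tables)
}

# Usage (not run here), saving the result the same way as in the question:
# data_all <- scrape_pages(0:5089)
# write.table(data_all, "df_data.csv", sep = ";", na = "", quote = FALSE, row.names = FALSE)
```

Failed pages are reported with `message` and skipped, so you can rerun `scrape_pages` on just those page numbers afterwards.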

Upvotes: 4
