Mateus Maciel

Reputation: 161

Download multiple files from a URL using R

I have this URL: https://www.cnpm.embrapa.br/projetos/relevobr/download/index.htm with geographic information about Brazilian states. If you click on any state, you will find a map divided into grids.

Now, if you click on any grid, you will be able to download the geographic information for that specific grid.

What I need is to download all the grids at once. Is this possible?

Upvotes: 3

Views: 2682

Answers (1)

alistaire

Reputation: 43334

You can scrape the page to get the URLs for the zip files, then iterate across the URLs to download everything:

library(rvest)

# get page source
h <- read_html('https://www.cnpm.embrapa.br/projetos/relevobr/download/mg/mg.htm')

urls <- h %>% 
    html_nodes('area') %>%    # get all `area` nodes
    html_attr('href') %>%    # get the link attribute of each node
    sub('\\.htm$', '.zip', .) %>%    # change file suffix (dot escaped so it matches literally)
    paste0('https://www.cnpm.embrapa.br/projetos/relevobr/download/mg/', .)    # prepend the base URL

# create a directory for it all
dir <- file.path(tempdir(), 'mg')
dir.create(dir)

# iterate and download (mode = 'wb' writes in binary mode so the zips are not corrupted on Windows)
lapply(urls, function(url) download.file(url, file.path(dir, basename(url)), mode = 'wb'))

# check it's there
list.files(dir)
#>  [1] "sd-23-y-a.zip" "sd-23-y-b.zip" "sd-23-y-c.zip" "sd-23-y-d.zip" "sd-23-z-a.zip" "sd-23-z-b.zip"
#>  [7] "sd-23-z-c.zip" "sd-23-z-d.zip" "sd-24-y-c.zip" "sd-24-y-d.zip" "se-22-y-d.zip" "se-22-z-a.zip"
#> [13] "se-22-z-b.zip" "se-22-z-c.zip" "se-22-z-d.zip" "se-23-v-a.zip" "se-23-v-b.zip" "se-23-v-c.zip"
#> [19] "se-23-v-d.zip" "se-23-x-a.zip" "se-23-x-b.zip" "se-23-x-c.zip" "se-23-x-d.zip" "se-23-y-a.zip"
#> [25] "se-23-y-b.zip" "se-23-y-c.zip" "se-23-y-d.zip" "se-23-z-a.zip" "se-23-z-b.zip" "se-23-z-c.zip"
#> [31] "se-23-z-d.zip" "se-24-v-a.zip" "se-24-v-b.zip" "se-24-v-c.zip" "se-24-v-d.zip" "se-24-y-a.zip"
#> [37] "se-24-y-c.zip" "sf-22-v-b.zip" "sf-22-x-a.zip" "sf-22-x-b.zip" "sf-23-v-a.zip" "sf-23-v-b.zip"
#> [43] "sf-23-v-c.zip" "sf-23-v-d.zip" "sf-23-x-a.zip" "sf-23-x-b.zip" "sf-23-x-c.zip" "sf-23-x-d.zip"
#> [49] "sf-23-y-a.zip" "sf-23-y-b.zip" "sf-23-z-a.zip" "sf-23-z-b.zip" "sf-24-v-a.zip"
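
The code above handles a single state (mg). To cover several states, the same steps could be wrapped in a function; what follows is only a rough sketch under assumptions, not something tested against the whole site. The names grid_zip_urls and download_state are hypothetical helpers, and the sketch assumes every state page follows the same <state>/<state>.htm pattern as mg/mg.htm, which is not confirmed here. Failed downloads are recorded as FALSE instead of stopping the loop.

```r
# Base URL taken from the code above.
base_url <- 'https://www.cnpm.embrapa.br/projetos/relevobr/download/'

# Hypothetical helper: turn the `href` values of a state page into zip URLs.
grid_zip_urls <- function(hrefs, state) {
    hrefs <- hrefs[!is.na(hrefs)]                  # drop `area` nodes without a link
    if (length(hrefs) == 0) return(character(0))   # paste0() would otherwise return the bare base URL
    zips <- sub('\\.htm$', '.zip', hrefs)          # change file suffix
    paste0(base_url, state, '/', zips)             # prepend the state's base URL
}

# Hypothetical wrapper: download every grid of one state into `dest`.
# ASSUMPTION: each state page lives at <state>/<state>.htm, like mg/mg.htm.
download_state <- function(state, dest = file.path(tempdir(), state)) {
    page <- rvest::read_html(paste0(base_url, state, '/', state, '.htm'))
    hrefs <- rvest::html_attr(rvest::html_nodes(page, 'area'), 'href')
    urls <- grid_zip_urls(hrefs, state)
    dir.create(dest, recursive = TRUE, showWarnings = FALSE)
    # download.file() returns 0 on success; errors are caught and recorded as FALSE
    ok <- vapply(urls, function(url) {
        tryCatch(
            download.file(url, file.path(dest, basename(url)), mode = 'wb') == 0,
            error = function(e) FALSE
        )
    }, logical(1))
    invisible(ok)
}

# Usage (not run here; state codes other than 'mg' are assumptions):
# results <- lapply(c('mg', 'sp'), download_state)
```

Each element of results would be a named logical vector, so the URLs that failed can be picked out and retried.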

Upvotes: 5
