Web scraping with R: can't see the downloadable links

Question

I am trying to download some .xlsx files from this kind of webpage (EDIT: or this one). However, when I display the source code (right click --> view source code), I can't see all the content of the actual webpage, just the header and the footer.

I tried using rvest to list the downloadable links, but the result is the same: it returns only the ones from the header and the footer:

library(rvest)
# html() is deprecated in rvest; read_html() is its replacement
read_html("https://b2share.eudat.eu/records/8d47a255ba5749e3ac169527e22f0068") %>%
  html_nodes("a")

Returns:

#{xml_nodeset (5)}
#[1] Go to EUDAT website
#[2] 
#[3] Acceptable Use Policy
#[4] Data Privacy Statement
#[5] About EUDAT

Any idea how to access the content of the whole page?

QHarr · Accepted Answer

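The file list is most likely filled in by JavaScript after the page loads, which would explain why neither the page source nor rvest shows it. Instead of scraping the HTML, pass the record id to the site's API endpoint, which returns the parts needed to construct the file download links: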

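library(jsonlite)

# Fetch the record's metadata from the API as a nested list
d <- jsonlite::read_json('https://b2share.eudat.eu/api/records/8d47a255ba5749e3ac169527e22f0068')

# Join the base URL of the record's files with the key of the first file
files <- paste(d$links$files, d$files[[1]]$key, sep = '/')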
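The snippet above reads only the first entry of d$files. If a record holds several files, something like the following might collect a link for each of them. This is a minimal sketch, assuming every entry under files carries a key field as in the code above; build_file_links is a hypothetical helper name, not part of jsonlite or the API.

```r
# Sketch: build one download link per file in a parsed record.
# `record` is assumed to be the list returned by jsonlite::read_json(),
# with a `links$files` base URL and a `files` list whose items have a `key`.
build_file_links <- function(record) {
  # Collect the key (file name) of every entry, not only the first one.
  keys <- vapply(record$files, function(f) f$key, character(1))
  # paste() would still return one string for an empty key vector, so stop early.
  if (length(keys) == 0) {
    return(character(0))
  }
  paste(record$links$files, keys, sep = "/")
}
```

With the parsed record d from above, build_file_links(d) would return a character vector with one element per file.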

For re-use, you can rewrite this as a function that accepts the record's page URL as its argument:

library(jsonlite)
library(stringr)

get_links <- function(link){
  record_id <- tail(str_split(link, '/')[[1]], 1)
  d <- jsonlite::read_json(paste0('https://b2share.eudat.eu/api/records/', record_id))
  links <- paste(d$links$files, d$files[[1]]$key , sep = '/')
  return(links)
}

get_links('https://b2share.eudat.eu/records/ce32a67a789b44a1a15965fd28a8cb17')
get_links('https://b2share.eudat.eu/records/8d47a255ba5749e3ac169527e22f0068')

If you already have the record ids, you could simplify this to a function that takes the id directly:

library(jsonlite)

get_links <- function(record_id){
  d <- jsonlite::read_json(paste0('https://b2share.eudat.eu/api/records/', record_id))
  links <- paste(d$links$files, d$files[[1]]$key , sep = '/')
  return(links)
}

get_links('ce32a67a789b44a1a15965fd28a8cb17')
get_links('8d47a255ba5749e3ac169527e22f0068')
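Once you have the links, the files still need to be saved to disk. The following is a minimal sketch of one way to do that with base R's download.file; download_links, dest_dir and downloader are hypothetical names, and the downloader argument exists only so the function can be tried without network access.

```r
# Sketch: save each link to disk under the file name at the end of its URL.
download_links <- function(links, dest_dir = ".",
                           downloader = utils::download.file) {
  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)
  # Name each local file after the last path segment of its URL.
  dest <- file.path(dest_dir, basename(links))
  for (i in seq_along(links)) {
    # mode = "wb" keeps binary files such as .xlsx intact on Windows.
    downloader(links[i], destfile = dest[i], mode = "wb")
  }
  # Return the local paths without printing them.
  invisible(dest)
}
```

For example, download_links(get_links('8d47a255ba5749e3ac169527e22f0068'), "data") would try to save that record's file into a data folder.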
