Florian Seliger

Reputation: 441

Rvest only scrapes part of the table

I am new to rvest. I am trying to scrape information about cryptocurrencies from this website: https://coinmarketcap.com/.

I am able to scrape all information on the first 10 currencies listed in the table, but for other currencies I only get the name and price. What's the reason? How can I scrape all information about all currencies?

My code:

library(rvest)
market <- read_html('https://coinmarketcap.com/') %>%
  html_table(fill = TRUE) %>%
  as.data.frame()

Upvotes: 0

Views: 45

Answers (1)

anpami

Reputation: 888

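The webpage is rendered dynamically with JavaScript rather than delivered all at once, so rvest on its own only sees the part of the table that is in the initial HTML. You need a real browser session to get the rendered page, for example via RSelenium.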

Does the following work?

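library(rvest)  # provides html_table() and the %>% pipe

url <- "https://coinmarketcap.com/"

# Start RSelenium with Firefox
rD <- RSelenium::rsDriver(browser = "firefox", port = 4546L, verbose = FALSE)
remDr <- rD[["client"]]
remDr$navigate(url)
Sys.sleep(4)  # give the page a few seconds to render

# Get the rendered page source and parse it
web <- remDr$getPageSource()
web <- xml2::read_html(web[[1]])

# Named `market` rather than `table` to avoid masking base::table()
market <- html_table(web, fill = TRUE) %>%
  as.data.frame()

# Close the browser and stop the Selenium server
remDr$close()
gc()
rD$server$stop()

# Windows only: kill any leftover Java process started by Selenium
system("taskkill /im java.exe /f", intern = FALSE, ignore.stdout = FALSE)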
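
If the rows further down still come back incomplete after the fixed wait, one possible explanation (an assumption, not something verified against the site) is that the page fills in rows only as they scroll into view. A minimal sketch of a scrolling step follows; `scroll_page` and its `steps` and `pause` arguments are hypothetical names, and the sketch relies only on RSelenium's `executeScript` method:

```r
# Hypothetical helper: scroll down one viewport at a time so that
# lazily rendered rows get a chance to load before the source is read.
scroll_page <- function(remDr, steps = 20, pause = 0.5) {
  for (i in seq_len(steps)) {
    # Run a small piece of JavaScript in the browser session
    remDr$executeScript("window.scrollBy(0, window.innerHeight);", args = list())
    Sys.sleep(pause)  # wait for newly visible rows to render
  }
  invisible(remDr)
}

# Usage, between remDr$navigate(url) and remDr$getPageSource():
# scroll_page(remDr)
```

The number of steps and the pause length are guesses and would need tuning against the actual page.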

By the way, the webpage seems to have an API. It is possible that you could get the same data more efficiently through that API.
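
As a rough sketch of the API route: the endpoint URL, the `X-CMC_PRO_API_KEY` header, and the query parameters below are assumptions based on CoinMarketCap's public API and should be checked against its documentation, and an API key is required. `parse_listings`, `fetch_listings`, and the `CMC_API_KEY` environment variable are hypothetical names, and the httr and jsonlite packages are assumed to be installed.

```r
# Assumption: endpoint taken from CoinMarketCap's public API; verify before use.
api_url <- "https://pro-api.coinmarketcap.com/v1/cryptocurrency/listings/latest"

# Turn the JSON body into a flat data frame, one row per currency.
# Assumes the listings sit under a top-level "data" array.
parse_listings <- function(json_text) {
  parsed <- jsonlite::fromJSON(json_text)
  # Nested objects (e.g. per-currency quotes) become dotted column names
  jsonlite::flatten(parsed$data)
}

# Request the listings and parse them (needs network access and a valid key).
fetch_listings <- function(api_key, limit = 100) {
  resp <- httr::GET(
    api_url,
    httr::add_headers("X-CMC_PRO_API_KEY" = api_key),  # assumed header name
    query = list(start = 1, limit = limit, convert = "USD")
  )
  httr::stop_for_status(resp)  # fail loudly on HTTP errors
  parse_listings(httr::content(resp, as = "text", encoding = "UTF-8"))
}

# Usage (key read from a hypothetical environment variable):
# market <- fetch_listings(Sys.getenv("CMC_API_KEY"))
```

Compared with driving a browser, this avoids rendering the page at all, at the cost of needing a key and staying within whatever rate limits the API imposes.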

Upvotes: 1
