Reputation: 1166
I usually have no trouble scraping tables from the web, but for some reason, when I try to scrape the historical data from the following page, I can't select the table I want.
This is the link and my code:
library(tidyverse)
library(rvest)
url <- read_html("https://coinmarketcap.com/currencies/bitcoin/historical-data/")
table <- url %>%
  html_table() %>%
  .[[1]] %>%
  as.data.frame()
thanks
Upvotes: 1
Views: 1922
Reputation: 4658
The data is well hidden in a script element on the page. It is loaded into a table dynamically via JavaScript, which is why you can't find it.
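To illustrate the point, here is a minimal offline sketch. The HTML string is a made-up, heavily simplified stand-in for such a page (the `table-container` div and the JSON fields `day` and `open` are assumptions for illustration, not the real page's structure). It shows that `html_table()` finds nothing in the static HTML, while the JSON inside the script element can still be parsed:

```r
library(rvest)
library(jsonlite)

# Hypothetical stand-in page: no <table> in the static HTML,
# only an empty container and a script element holding JSON.
html <- '<html><body>
  <div id="table-container"></div>
  <script id="__NEXT_DATA__" type="application/json">
    {"props": {"quotes": [{"day": "2020-10-10", "open": 1.5},
                          {"day": "2020-10-11", "open": 2.5}]}}
  </script>
</body></html>'

page <- read_html(html)

# html_table() only sees tables present in the downloaded HTML,
# so the result here is an empty list.
tables <- html_table(page)
length(tables)

# The data itself is available as text inside the script element.
payload <- page %>%
  html_node("#__NEXT_DATA__") %>%
  html_text() %>%
  fromJSON()

quotes <- payload$props$quotes
quotes
```

rvest does not execute JavaScript, so anything a page builds in the browser after loading has to be recovered from the raw source like this, or fetched from whatever endpoint the page itself calls.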
The following extracts the data from that script element (with ID __NEXT_DATA__):
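library(tidyverse)
library(rvest)
# Download and parse the page
url <- read_html("https://coinmarketcap.com/currencies/bitcoin/historical-data/")
# Take the text of the script element with ID __NEXT_DATA__ and parse it as JSON
table <- url %>%
  html_node("#__NEXT_DATA__") %>%
  html_text() %>%
  jsonlite::fromJSON()
# The historical quotes sit deep inside the parsed list
table$props$initialState$cryptocurrency$ohlcvHistorical[[1]]$quotes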
which gives:
time_open time_close time_high time_low quote.USD.open quote.USD.high
1 2020-10-10T00:00:00.000Z 2020-10-10T23:59:59.999Z 2020-10-10T03:16:44.000Z 2020-10-10T00:01:41.000Z 11059.14 11442.21
2 2020-10-11T00:00:00.000Z 2020-10-11T23:59:59.999Z 2020-10-11T15:31:43.000Z 2020-10-11T00:52:06.000Z 11296.08 11428.81
...
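The column names in that output (quote.USD.open and so on) suggest the quotes come back as a nested data frame, and the timestamps are plain strings. As a hedged follow-up, the sketch below flattens such a result and parses the timestamps. The JSON is a small hand-built sample that reuses the values printed above; its nesting (quote, then USD) is inferred from the column names and is an assumption, not something confirmed against the live page.

```r
library(jsonlite)

# Hand-built sample mimicking the assumed shape of the quotes data
sample_json <- '[
  {"time_open": "2020-10-10T00:00:00.000Z",
   "quote": {"USD": {"open": 11059.14, "high": 11442.21}}},
  {"time_open": "2020-10-11T00:00:00.000Z",
   "quote": {"USD": {"open": 11296.08, "high": 11428.81}}}
]'

quotes <- fromJSON(sample_json)

# Turn nested data frame columns into ordinary columns.
# The jsonlite:: prefix matters when tidyverse is attached,
# because purrr also exports a function called flatten().
flat <- jsonlite::flatten(quotes)
names(flat)

# Convert the ISO 8601 strings to date-times in UTC
flat$time_open <- as.POSIXct(flat$time_open,
                             format = "%Y-%m-%dT%H:%M:%OSZ",
                             tz = "UTC")
str(flat)
```

With the real data, the same two steps would be applied to the quotes object extracted above instead of the sample.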
Upvotes: 2