Read dynamic webpage html in either Python or R

Question

I am trying to automate scraping tables from webpages such as the Investing.com Economic Calendar. This is fairly straightforward in R if we are only interested in the default tab, which displays today's calendar. Here is the R code:

library(rvest)
library(dplyr)

Econ_webpage <- read_html("https://www.investing.com/economic-calendar/")

Indicators <- Econ_webpage %>%
  html_nodes("#economicCalendarData") %>%
  html_table(fill = TRUE) %>%
  .[[1]] %>%
  .[-(1:3), -c(match("Imp.", colnames(.)), ncol(.))]

which produces the desired result displayed below.

> head(Indicators)
   Time Cur.                             Event Actual Forecast Previous 
4 19:50  JPY           BoJ Summary of Opinions                          
5 19:50  JPY              Exports (YoY)  (Feb)            1.9%    12.3% 
6 19:50  JPY              Imports (YoY)  (Feb)           17.1%     7.9% 
7 19:50  JPY              Trade Balance  (Feb)           -100B    -944B 
8 20:01  GBP Rightmove House Price Index (MoM)                     0.8% 
9 21:30  CNY         House Prices (YoY)  (Feb)                     5.0%

However, to scrape the table in the Tomorrow tab I need a Selenium driver. I tried RSelenium but cannot get it to work on my machine, so I turned to Selenium in Python. I use the following Python code:

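from selenium import webdriver

# PATH_TO_CHROMEDRIVER must hold the location of the chromedriver binary
driver = webdriver.Chrome(executable_path=PATH_TO_CHROMEDRIVER)
driver.get("https://www.investing.com/economic-calendar/")
# Switch to the "Tomorrow" tab, then grab the rendered page source
driver.find_element_by_id("timeFrame_tomorrow").click()
html = driver.page_source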

Now I have the HTML containing the desired table data in a string, but I don't know how to parse it efficiently to reproduce the result of the R code. Could I call R from Python through the rpy2 package, or does someone know an easier way to extract the table in the same form as above? How do I parse this HTML string?
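
To make the parsing step concrete, here is a minimal sketch that pulls the rows of one table out of an HTML string using only Python's standard library. The class name `TableExtractor` and the `sample_html` string are hypothetical and exist only for illustration: the sample is a hand-made stand-in for `driver.page_source`, and the real page's markup is likely more complex (extra columns, header rows, nested elements), so treat this as a starting point rather than a finished solution.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the cell text of every row in the table with a given id."""

    def __init__(self, table_id):
        super().__init__()
        self.table_id = table_id
        self.depth = 0      # 0 = outside the target table, 1 = inside it, >1 = nested table
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being built
        self._cell = None   # text fragments of the cell currently being built

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            if self.depth > 0:
                self.depth += 1
            elif dict(attrs).get("id") == self.table_id:
                self.depth = 1
        elif self.depth == 1:
            if tag == "tr":
                self._row = []
            elif tag in ("td", "th") and self._row is not None:
                self._cell = []

    def handle_endtag(self, tag):
        if tag == "table" and self.depth > 0:
            self.depth -= 1
        elif self.depth == 1:
            if tag in ("td", "th") and self._cell is not None:
                # Join the fragments and collapse runs of whitespace
                self._row.append(" ".join("".join(self._cell).split()))
                self._cell = None
            elif tag == "tr" and self._row is not None:
                self.rows.append(self._row)
                self._row = None

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


# Hypothetical, simplified stand-in for the string returned by driver.page_source
sample_html = """
<table id="economicCalendarData">
  <tr><th>Time</th><th>Cur.</th><th>Event</th><th>Actual</th><th>Forecast</th><th>Previous</th></tr>
  <tr><td>19:50</td><td>JPY</td><td>Exports (YoY)  (Feb)</td><td></td><td>1.9%</td><td>12.3%</td></tr>
</table>
"""

parser = TableExtractor("economicCalendarData")
parser.feed(sample_html)
header, *body = parser.rows
for row in body:
    print(dict(zip(header, row)))
```

In practice one would call `parser.feed(html)` with the string obtained from Selenium. If pandas and an HTML parser such as lxml are installed, `pandas.read_html` with `attrs={"id": "economicCalendarData"}` is probably a shorter route to a data frame resembling the R result; recent pandas versions expect the string to be wrapped in `io.StringIO`.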
