Reputation: 13
I am trying to scrape the fundamentals data table (P/E ratio, P/B ratio and dividend yield) from the NSE website (link). I tried the following with the rvest package:
library(rvest)

url <- "https://www1.nseindia.com/products/content/equities/indices/historical_pepb.htm"
pgsession <- html_session(url)
But, I receive this error:
Error in curl::curl_fetch_memory(url, handle = handle) :
LibreSSL SSL_read: SSL_ERROR_SYSCALL, errno 60
Also, I tried the httr package (CSS selectors identified using the Chrome extension 'SelectorGadget'):
library(httr)

fd <- list(submit = "Get Data",   # not sure if this is the correct selector
           IndexName = "NIFTY 50",
           fromDate = "01-06-2020",
           toDate = "15-06-2020")
resp <- POST(url, body = fd, encode = "form")
But I receive the same error. I have scanned many forums for troubleshooting, but it seems the website is blocking scraping attempts. Can someone confirm this, or suggest a way to scrape the table from this website?
Upvotes: 0
Views: 1233
Reputation: 13
Here's a (crude) wrapper to fetch NIFTY 50 fundamentals data from the NSE website:
get.nse.ratios <- function(index.nse = 'NIFTY 50',
                           date.start = as.Date('2001-01-01'),
                           date.end = Sys.Date()) {
  # url.base <- 'https://www1.nseindia.com/products/content/equities/indices/historical_pepb.htm'
  index.nse <- gsub(' ', '%20', index.nse)
  # Split the date range into sub-periods the site will accept
  max.history.constraint <- 100
  dates.start <- seq.Date(date.start, date.end, by = max.history.constraint)
  data.master <- data.frame()
  # Loop over sub-periods to extract data
  for (fromDate in dates.start) {
    fromDate <- as.Date(fromDate, origin = '1970-01-01')  # the loop drops the Date class
    toDate <- min(fromDate + (max.history.constraint - 1), Sys.Date())
    cat(sprintf('Fetching data from %s to %s \n', fromDate, toDate))
    # Reformat dates as dd-mm-yyyy, as expected by the NSE endpoint
    fromDate <- format.Date(fromDate, '%d-%m-%Y')
    toDate <- format.Date(toDate, '%d-%m-%Y')
    # Build the url for the sub-period
    url.sub <- sprintf("https://www1.nseindia.com/products/dynaContent/equities/indices/historical_pepb.jsp?indexName=%s&fromDate=%s&toDate=%s&yield1=undefined&yield2=undefined&yield3=undefined&yield4=all",
                       index.nse, fromDate, toDate)
    # Scrape the first table from the sub-period url
    data.sub <- rvest::html_table(xml2::read_html(url.sub))[[1]]
    # Clean the table: row 2 holds the column names, the last row is a footer
    names.columns <- unname(unlist(data.sub[2, ]))
    data.clean <- data.sub[3:(nrow(data.sub) - 1), ]
    colnames(data.clean) <- names.columns
    data.clean$Date <- as.Date(data.clean$Date, format = '%d-%b-%Y')
    # Convert the remaining character columns to numeric
    cols.num <- names(which(sapply(data.clean, class) == 'character'))
    data.clean[cols.num] <- sapply(data.clean[cols.num], as.numeric)
    # Append to master data
    data.master <- rbind(data.master, data.clean)
  }
  return(data.master)
}
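For example, to pull the range from the question (the call below is just an illustration; adjust the index name and dates as needed):

nifty.ratios <- get.nse.ratios(index.nse = 'NIFTY 50',
                               date.start = as.Date('2020-06-01'),
                               date.end = as.Date('2020-06-15'))
head(nifty.ratios)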
Upvotes: 0
Reputation: 4658
If you right-click the page, click 'Inspect element', and go to the 'Network' tab, you can see the request being made when you click the 'Get data' button.
In this case, the request goes to the URL below, which can be read and parsed into a data frame using, for example, rvest::html_table(). By changing the URL parameters I'm positive you can extract the table you want.
url <- "https://www1.nseindia.com/products/dynaContent/equities/indices/historical_pepb.jsp?indexName=NIFTY%2050&fromDate=01-06-2020&toDate=02-06-2020&yield1=undefined&yield2=undefined&yield3=undefined&yield4=all"
rvest::html_table(xml2::read_html(url))[[1]]
gives
  Historical NIFTY 50 P/E, P/B & Div. Yield values
1          For the period 01-06-2020 to 02-06-2020
2        Date   P/E  P/B Div Yield
3 01-Jun-2020 22.96 2.80      1.55
4 02-Jun-2020 23.31 2.84      1.53
5                     Download file in csv format
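For a different index or date range, the same URL can be built programmatically. A minimal sketch, assuming the query parameters shown in the URL above (build.pepb.url is just an illustrative name; dates must be in dd-mm-yyyy format):

build.pepb.url <- function(index = "NIFTY 50", from = "01-06-2020", to = "02-06-2020") {
  # URLencode turns the space in the index name into %20
  sprintf(paste0("https://www1.nseindia.com/products/dynaContent/equities/indices/historical_pepb.jsp",
                 "?indexName=%s&fromDate=%s&toDate=%s",
                 "&yield1=undefined&yield2=undefined&yield3=undefined&yield4=all"),
          utils::URLencode(index, reserved = TRUE), from, to)
}
rvest::html_table(xml2::read_html(build.pepb.url()))[[1]]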
Upvotes: 1