Metrics

Reputation: 15458

Webscraping the data using R

Aim: I am trying to scrape the historical daily stock prices for all companies from the webpage http://www.nepalstock.com/datanepse/previous.php. The following code runs, but it always returns the daily stock prices for the most recent date (Feb 5, 2015) only. In other words, the output is the same irrespective of the date I enter. I would appreciate any help with this.

    library(RHTMLForms)
    library(RCurl)
    library(XML)

    url <- "http://www.nepalstock.com/datanepse/previous.php"
    forms <- getHTMLFormDescription(url)

    # We are interested in the second form, which holds the date field:
    # forms[[2]]
    # HTML Form: http://www.nepalstock.com/datanepse/
    #   Date: [  ]

    get_stock <- createFunction(forms[[2]])

    # Create a sequence of dates from start to end and store it as a list
    date_daily <- as.list(seq(as.Date("2011-08-24"), as.Date("2011-08-30"), "days"))

    # Determine the number of elements in the list
    num <- length(date_daily)

    # The page contains 18 tables; the 7th is the one we want
    daily_1 <- lapply(date_daily, function(x) {
      show(x)  # display the date being requested
      readHTMLTable(htmlParse(get_stock(Date = x)), which = 7)
    })

    # Change the column names
    col_name <- c("SN","Traded_Companies","No_of_Transactions","Max_Price","Min_Price","Closing_Price","Total_Share","Amount","Previous_Closing","Difference_Rs.")
    daily_2 <- lapply(daily_1, setNames, nm = col_name)

Output:
> head(daily_2[[1]],5)
 SN                                   Traded_Companies No_of_Transactions Max_Price Min_Price Closing_Price Total_Share    Amount
1  1                  Agricultural Development Bank Ltd                 24       489       471           473       2,868 1,359,038
2  2 Arun Valley Hydropower Development Company Limited                 40       365       360           362       8,844 3,199,605
3  3                    Alpine Development Bank Limited                 11       297       295           295         150    44,350
4  4                   Asian Life Insurance Co. Limited                 10     1,230     1,215         1,225         898 1,098,452
5  5                         Apex Development Bank Ltd.                 23       131       125           131       6,033   769,893
  Previous_Closing Difference_Rs.
1              480             -7
2              363             -1
3              303             -8
4            1,242            -17
5              132             -1
> tail(daily_2[[1]],5)
     SN                 Traded_Companies No_of_Transactions Max_Price Min_Price Closing_Price Total_Share    Amount Previous_Closing
140 140               United Finance Ltd                  4       255       242           242         464   115,128              255
141 141  United Insurance Co.(Nepal)Ltd.                  3       905       905           905         234   211,770              915
142 142         Vibor Bikas Bank Limited                  7       158       152           156         710   109,510              161
143 143 Western Development Bank Limited                 35       320       311           313       7,631 2,402,497              318
144 144    Yeti Development Bank Limited                 22       139       132           139      14,355 1,921,511              134
    Difference_Rs.
140            -13
141            -10
142             -5
143             -5
144              5

Upvotes: 1

Views: 144

Answers (1)

hadley

Reputation: 103898

Here's one quick approach. Note that the site uses a POST request to send the date to the server.

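    library(rvest)
    library(httr)

    # Send the date as a POST form field, then parse the returned HTML.
    # read_html() replaces html(), the name used by older rvest versions.
    page <- "http://www.nepalstock.com/datanepse/previous.php" %>%
      POST(body = list(Date = "2015-02-01")) %>%
      read_html()

    # Extract the price table
    page %>%
      html_node(".dataTable") %>%
      html_table(header = TRUE)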
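
Since the question asks for a range of dates, the same POST request can be repeated once per day. The following is only a rough sketch, not tested against the live site: `make_bodies()` and `fetch_day()` are hypothetical helper names, and it assumes the server accepts dates formatted as YYYY-MM-DD (as in the example above) and that every day's page exposes the table under `.dataTable`. It calls `read_html()`, the current rvest name for what older versions called `html()`, and needs the httr and rvest packages installed.

```r
# Build one POST body per day. The field name "Date" is taken from the
# request shown above; the YYYY-MM-DD format is an assumption.
make_bodies <- function(start, end) {
  days <- seq(as.Date(start), as.Date(end), by = "day")
  lapply(format(days, "%Y-%m-%d"), function(d) list(Date = d))
}

# Fetch and parse the price table for one day (performs a network request).
fetch_day <- function(body,
                      url = "http://www.nepalstock.com/datanepse/previous.php") {
  resp <- httr::POST(url, body = body)
  httr::stop_for_status(resp)  # fail early on an HTTP error
  page <- rvest::read_html(httr::content(resp, as = "text", encoding = "UTF-8"))
  rvest::html_table(rvest::html_node(page, ".dataTable"), header = TRUE)
}

bodies <- make_bodies("2011-08-24", "2011-08-30")

# Uncomment to query the site, one request per day:
# daily <- lapply(bodies, fetch_day)
# names(daily) <- vapply(bodies, function(b) b$Date, character(1))
```

Non-trading days may return a page without the table, so in practice you would probably want to wrap `fetch_day()` in `tryCatch()` and pause briefly between requests.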

Upvotes: 3
