Bogaso

Reputation: 3318

Non-Selenium way to download data from the web using R/Python

I am looking for a non-Selenium way to extract data from a website using R (preferably) or Python.

In R, I tried the code below:

library(rvest)
library(XML)
Link = 'https://www.bseindia.com/stock-share-price/itc-ltd/itc/500875/'
read_html(Link) %>% html_nodes(".textvalue .ng-binding") %>% html_text()
## character(0)

Ideally I should be able to get most of the numerical values shown on the page, but as you can see, nothing was extracted. Any pointer towards the right approach would be very helpful.

I also tried Python's BeautifulSoup module as below, without any success:

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
uClient = uReq("https://www.bseindia.com/stock-share-price/itc-ltd/itc/500875/")
page_html = uClient.read()
page_soup = soup(page_html, 'html.parser')
page_soup.findAll("div", {"class":"textvalue.ng-binding"})

Thanks,

Upvotes: 1

Views: 219

Answers (1)

QHarr

Reputation: 84475

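This is easy because you can call the API that the page itself uses. The values on the page appear to be filled in by JavaScript after it loads (the ng-binding class suggests AngularJS), which would explain why rvest and BeautifulSoup find nothing in the raw HTML. The returned JSON has all the values, but I am printing only one.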

Python:

import requests

r = requests.get('https://api.bseindia.com/BseIndiaAPI/api/StockTrading/w?flag=&quotetype=EQ&scripcode=500875').json()
print(r['MktCapFF'])

R:

library(rvest)
library(jsonlite)

r <- read_html('https://api.bseindia.com/BseIndiaAPI/api/StockTrading/w?flag=&quotetype=EQ&scripcode=500875') %>%
  html_text() %>%
  jsonlite::fromJSON(.)
print(r$MktCapFull)
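
If you want several fields, or several scrip codes, the same idea can be wrapped in small helpers. This is only a sketch using the standard library: the helper names (build_quote_url, pick_fields, fetch_quote) are made up for illustration, the User-Agent header is an assumption in case the server rejects the default client, and the only keys I know to be present are the two printed above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint taken from the URL used above; the query parameters mirror it.
API_BASE = "https://api.bseindia.com/BseIndiaAPI/api/StockTrading/w"


def build_quote_url(scrip_code, quote_type="EQ"):
    """Build the API URL for one scrip code (hypothetical helper)."""
    query = urlencode({"flag": "", "quotetype": quote_type, "scripcode": scrip_code})
    return f"{API_BASE}?{query}"


def pick_fields(payload, wanted):
    """Return only the requested keys; keys missing from the payload map to None."""
    return {key: payload.get(key) for key in wanted}


def fetch_quote(scrip_code, timeout=10):
    """Download and decode the JSON payload for one scrip code."""
    # Assumption: a browser-like User-Agent may be needed; drop it if it is not.
    request = Request(build_quote_url(scrip_code), headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For the stock in the question, calling fetch_quote(500875) and passing the result to pick_fields(..., ["MktCapFull", "MktCapFF"]) should give a small dict with just those two entries, assuming the API still responds as it does above.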

Upvotes: 1
