Web-scraping a JavaScript table with Python BeautifulSoup

Question

I can't get a JavaScript-rendered table with BeautifulSoup; it returns an empty array.

I tried to get data from this page: https://www.hkex.com.hk/Mutual-Market/Stock-Connect/Statistics/Historical-Daily?sc_lang=en#select4=1&select5=2&select3=0&select2=3&select1=24

import requests, json
text = requests.get("https://www.hkex.com.hk/Mutual-Market/Stock-Connect/Statistics/Historical-Daily?sc_lang=en#select4=0&select5=2&select3=0&select2=3&select1=24")
data = json.loads(text)

print(data['Scty'])

QHarr · Accepted Answer

There is another URL you can use, found by looking at the browser's network tab. After a little string manipulation, the response text becomes a string that can be loaded with json and contains everything on the page (including the data for all 4 drop-down geographies). There is no need for bs4; you can extract everything you want with the json library.


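import json

import requests

# Data file the page loads in the background (visible in the browser's network tab)
r = requests.get('https://www.hkex.com.hk/eng/csm/DailyStat/data_tab_daily_20190425e.js?_=1556252093686')
# The file is a JavaScript assignment; strip the "tabData = " prefix to leave plain JSON
data = json.loads(r.text.replace('tabData = ',''))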
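If the script ever carries extra whitespace or a trailing semicolon, a plain replace may not leave valid JSON. A slightly more defensive sketch of the same idea follows; `parse_tab_data` and the sample string are hypothetical stand-ins for illustration, not the real HKEX payload:

```python
import json


def parse_tab_data(js_text, prefix="tabData = "):
    """Turn a `tabData = ...` JavaScript assignment into Python objects."""
    body = js_text.strip()
    # Remove the variable assignment only when it is actually present.
    if body.startswith(prefix):
        body = body[len(prefix):]
    # Drop a trailing semicolon if the script ends with one.
    body = body.rstrip().rstrip(";")
    return json.loads(body)


# Made-up sample standing in for r.text; the real file is much larger.
sample_js = 'tabData = [{"id": 0, "rows": [["a", "1"]]}];'
parsed = parse_tab_data(sample_js)
print(parsed[0]["rows"][0])
```

With the real response this would be called as parse_tab_data(r.text).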

For example, path to first row of table on landing page:
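To locate a path like this without guessing, the parsed structure can be walked and every leaf listed together with its index path. A minimal sketch, assuming the payload is made of nested dicts and lists; `iter_paths` and the sample data are hypothetical, and the key names shown are not taken from the HKEX response:

```python
def iter_paths(node, path="data"):
    """Yield (path, value) pairs for every leaf in nested dicts and lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from iter_paths(value, f"{path}[{key!r}]")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from iter_paths(value, f"{path}[{index}]")
    else:
        # Leaf value: a string, number, boolean, or None.
        yield path, node


# Made-up stand-in for the parsed response; key names are illustrative only.
sample_data = [{"market": "X", "table": {"rows": [["Rank", "Code"], ["1", "700"]]}}]
paths = list(iter_paths(sample_data))
for leaf_path, value in paths[:3]:
    print(leaf_path, "=", value)
```

Running the same loop over the real `data` and searching the printed paths for a value visible on the page (a stock code, for instance) shows which indices lead to the table row.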
