Reputation: 11
I can't get a JavaScript table with BeautifulSoup; it returns an empty array
I tried to get data from this page. https://www.hkex.com.hk/Mutual-Market/Stock-Connect/Statistics/Historical-Daily?sc_lang=en#select4=1&select5=2&select3=0&select2=3&select1=24
import requests, json
text = requests.get("https://www.hkex.com.hk/Mutual-Market/Stock-Connect/Statistics/Historical-Daily?sc_lang=en#select4=0&select5=2&select3=0&select2=3&select1=24")
data = json.loads(text)
print(data['Scty'])
Upvotes: 0
Views: 98
Reputation: 84475
There is another URL you can use, which I found by looking at the browser's network tab. With a little string manipulation on the response text (removing the leading 'tabData = ' assignment), you get a string that can be loaded with json
and that contains everything on the page, including the data for all 4 drop-down geographies. There is no need for bs4. You can extract everything you want with the json
library.
Explore it here.
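import requests
import json
# Data file that the page itself requests (visible in the network tab)
r = requests.get('https://www.hkex.com.hk/eng/csm/DailyStat/data_tab_daily_20190425e.js?_=1556252093686')
# Strip the JavaScript assignment prefix so that the remaining text can be parsed as JSON
data = json.loads(r.text.replace('tabData = ',''))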
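If the prefix ever changes slightly (different spacing, a trailing semicolon), the exact-string replace above would stop working. A slightly more tolerant variant might look like the sketch below. The helper name parse_js_assignment and the sample string are made up for illustration and are not taken from the HKEX response.

```python
import json


def parse_js_assignment(js_text):
    """Return the JSON value from text shaped like 'name = <json>' (optional ';')."""
    # Split on the first '=' only; everything after it is the payload.
    _, sep, payload = js_text.partition('=')
    if not sep:
        raise ValueError("no '=' found in the response text")
    # Drop surrounding whitespace and a trailing semicolon, if present.
    return json.loads(payload.strip().rstrip(';').strip())


# Hypothetical stand-in for r.text; the real payload is much larger.
sample = 'tabData = [{"name": "demo", "rows": []}];'
data = parse_js_assignment(sample)
print(data[0]["name"])
```

With the real request you would call parse_js_assignment(r.text) instead of using the sample string.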
For example, the path to the first row of the table on the landing page:
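The keys are easiest to read off the parsed object itself. As a rough sketch, a small helper like the one below lists the key paths of a nested structure, following only the first item of each list, so a path such as the one to the first table row can be copied from its output. The function name describe and the sample data are hypothetical and do not reflect the real HKEX keys.

```python
def describe(node, path="data", max_depth=6):
    """Return 'path: type' lines, descending only into the first item of each list."""
    lines = []

    def walk(value, where, depth):
        if depth > max_depth:
            return
        if isinstance(value, dict):
            for key, child in value.items():
                walk(child, f"{where}[{key!r}]", depth + 1)
        elif isinstance(value, list):
            lines.append(f"{where}: list of {len(value)}")
            if value:
                walk(value[0], f"{where}[0]", depth + 1)
        else:
            lines.append(f"{where}: {type(value).__name__}")

    walk(node, path, 0)
    return lines


# Made-up nested data standing in for the parsed response.
sample = [{"name": "demo", "rows": [{"cells": ["a", "b"]}]}]
lines = describe(sample)
for line in lines:
    print(line)
```

Running describe(data) on the parsed response should show which indexes and keys lead down to the table rows.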
Upvotes: 1