Reputation: 924
I am fetching JSON data and want to convert it into a pandas DataFrame.
Unfortunately, json.loads(req.text) raises an error:
ValueError: No JSON object could be decoded
Below is my code.
import json
import pandas as pd
import requests
HEADERS = {"User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.71 Safari/537.36",
           "Origin": "https://www.idx.co.id"}
req = requests.get("https://www.idx.co.id/Portals/0/StaticData/HomeHtml/data.js",
                   headers=HEADERS)
stocks = json.loads(req.text)  # the ValueError is raised here
columns = ['code', 'name']
# Keep only the wanted keys from each record
df = pd.DataFrame([{k: v for k, v in d.items() if k in columns}
                   for d in stocks], columns=columns)
Upvotes: 0
Views: 83
Reputation: 881
You are not actually receiving JSON, but a JavaScript file, which is why json.loads fails on the raw response text. By applying a simple regular expression that matches everything between the square brackets [],
you can achieve the desired result.
import requests
import json
import re
req = requests.get("https://www.idx.co.id/Portals/0/StaticData/HomeHtml/data.js")
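# Capture every array literal that appears as "= [...];" in the JavaScript source
content = re.findall(r"= (\[.*?\]);", req.text)
# Parse the first captured array as JSON
data = json.loads(content[0])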
print(data)
Edit: a useful website for testing Python regular expressions is https://pythex.org/
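To connect this back to the question's goal of keeping only the code and name fields, here is a minimal offline sketch of the same regex approach. The SAMPLE_JS string, its variable name, and the price field are made up for illustration and only stand in for the real data.js response; the helper names extract_array and select_columns are hypothetical as well. The pattern also uses re.DOTALL, on the assumption that the array might span several lines.

```python
import json
import re

# Hypothetical stand-in for req.text; the real data.js content may differ.
SAMPLE_JS = (
    'var stocks = [{"code": "AAA", "name": "Alpha", "price": 100}, '
    '{"code": "BBB", "name": "Beta", "price": 250}];'
)


def extract_array(js_text):
    """Parse the first `= [...];` array literal found in js_text as JSON."""
    # re.DOTALL lets `.` match newlines in case the array spans several lines.
    match = re.search(r"=\s*(\[.*?\]);", js_text, re.DOTALL)
    if match is None:
        raise ValueError("no array literal found in the JavaScript text")
    return json.loads(match.group(1))


def select_columns(records, columns):
    """Keep only the requested keys from each record."""
    return [{k: v for k, v in record.items() if k in columns}
            for record in records]


rows = select_columns(extract_array(SAMPLE_JS), ["code", "name"])
print(rows)
```

The resulting list of dictionaries can then be passed to pd.DataFrame(rows, columns=["code", "name"]) as in the question. Checking for a missing match before parsing gives a clearer error than indexing an empty result if the file format ever changes.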
Upvotes: 1