Reputation: 73
I used BeautifulSoup to get this data. It seems like a nested dictionary but I failed to transform them into dataframe. The type is .
{"page":1,"rows":[{"id":"160128","cell":{"fund_id":"160128","bond_ratio":"132.04","report_dt":"2019-12-31","is_outdate":false,"maturity_dt_tips":""}},{"id":"160130","cell":{"fund_id":"160130","bond_ratio":"165.29","report_dt":"2019-12-31","is_outdate":false,"maturity_dt_tips":""}},{"id":"160131","cell":{"fund_id":"160131","bond_ratio":"94.93","report_dt":"2019-12-31","is_outdate":false,"maturity_dt_tips":""}}],"total":3}
How should I get 'values' under each key of 'cell'? Thanks.
fund_id bond_ratio report_dt is_outdate maturity_dt_tips
0 160128 132.04 2019-12-31 false
1 160130 165.29 2019-12-31 false
2 160131 94.93 2019-12-31 false
Upvotes: 2
Views: 292
Reputation: 863351
Use json.json_normalize
:
d = {"page":1,"rows":[{"id":"160128","cell":{"fund_id":"160128","bond_ratio":"132.04","report_dt":"2019-12-31","is_outdate":False,"maturity_dt_tips":""}},{"id":"160130","cell":{"fund_id":"160130","bond_ratio":"165.29","report_dt":"2019-12-31","is_outdate":False,"maturity_dt_tips":""}},{"id":"160131","cell":{"fund_id":"160131","bond_ratio":"94.93","report_dt":"2019-12-31","is_outdate":False,"maturity_dt_tips":""}}],"total":3}
from pandas.io.json import json_normalize
df = json_normalize(d['rows'])
print (df)
id cell.fund_id cell.bond_ratio cell.report_dt cell.is_outdate \
0 160128 160128 132.04 2019-12-31 False
1 160130 160130 165.29 2019-12-31 False
2 160131 160131 94.93 2019-12-31 False
cell.maturity_dt_tips
0
1
2
Then if necessary remove values before .
in columns names add str.split
and indexing last values of lists by [-1]
:
df.columns = df.columns.str.split('.').str[-1]
print (df)
id fund_id bond_ratio report_dt is_outdate maturity_dt_tips
0 160128 160128 132.04 2019-12-31 False
1 160130 160130 165.29 2019-12-31 False
2 160131 160131 94.93 2019-12-31 False
Upvotes: 3