Reputation: 47
I got json data like this:
"data": [
{
"currency": "CNY",
"standardDate": "2021-03-31T00:00:00+08:00",
"reportDate": "2021-04-26T00:00:00+08:00",
"reportType": "first_quarterly_report",
"q": {
"ps": {
"np": {
"t": 773183901
}
}
},
"date": "2021-03-31T00:00:00+08:00",
"stockCode": "300413"
}
],
"data":["Like Above"] many times
when I use pd.DataFrame to get a df
, the column 'q'
value shows this:
q
{'ps': {'np': {'t': 773183901}}}
{'ps': {'np': {'t': 773183901}}}
{'ps': {'np': {'t': 773183901}}}
{'ps': {'np': {'t': 773183901}}}
how can I only get the number cleverly in the 'q'
column ?
q
773183901
773183901
773183901
773183901
Upvotes: 1
Views: 72
Reputation: 23217
A more direct way of accessing the JSON / dict value by name is to use the .str[]
accessor. Like this using column q
to access value of t
deep down the nested JSON:
df['q'] = df['q'].str['ps'].str['np'].str['t']
data = {'q':[
{'ps': {'np': {'t': 773183901}}},
{'ps': {'np': {'t': 773183902}}},
{'ps': {'np': {'t': 773183903}}},
{'ps': {'np': {'t': 773183904}}}]}
df = pd.DataFrame(data)
print(df)
q
0 {'ps': {'np': {'t': 773183901}}}
1 {'ps': {'np': {'t': 773183902}}}
2 {'ps': {'np': {'t': 773183903}}}
3 {'ps': {'np': {'t': 773183904}}}
df['q'] = df['q'].str['ps'].str['np'].str['t']
print(df)
q
0 773183901
1 773183902
2 773183903
3 773183904
Upvotes: 1
Reputation: 825
The best way is to get the desired columns
df1 = pd.json_normalize(data, record_path=['q','ps','np'], \
meta=['currency','standardDate','reportDate','<All required fields>'])
Upvotes: 2