Youyou Hua
Youyou Hua

Reputation: 47

how to get the data from a json format by using Pandas?

I got json data like this:

     "data": [
        {
          "currency": "CNY",
          "standardDate": "2021-03-31T00:00:00+08:00",
          "reportDate": "2021-04-26T00:00:00+08:00",
          "reportType": "first_quarterly_report",
          "q": {
            "ps": {
              "np": {
                "t": 773183901
              }
            }
          },
          "date": "2021-03-31T00:00:00+08:00",
          "stockCode": "300413"
        }
      ],
    "data":["Like Above"] many times

when I use pd.DataFrame to get a df, the column 'q' value shows this:

     q
     {'ps': {'np': {'t': 773183901}}} 
     {'ps': {'np': {'t': 773183901}}} 
     {'ps': {'np': {'t': 773183901}}} 
     {'ps': {'np': {'t': 773183901}}} 

how can I only get the number cleverly in the 'q' column ?

    q
    773183901
    773183901
    773183901
    773183901

Upvotes: 1

Views: 72

Answers (2)

SeaBean
SeaBean

Reputation: 23217

A more direct way of accessing the JSON / dict value by name is to use the .str[] accessor. Like this using column q to access value of t deep down the nested JSON:

df['q'] = df['q'].str['ps'].str['np'].str['t']

Demo

data = {'q':[
     {'ps': {'np': {'t': 773183901}}}, 
     {'ps': {'np': {'t': 773183902}}}, 
     {'ps': {'np': {'t': 773183903}}}, 
     {'ps': {'np': {'t': 773183904}}}]}

df = pd.DataFrame(data)

print(df)

                                  q
0  {'ps': {'np': {'t': 773183901}}}
1  {'ps': {'np': {'t': 773183902}}}
2  {'ps': {'np': {'t': 773183903}}}
3  {'ps': {'np': {'t': 773183904}}}


df['q'] = df['q'].str['ps'].str['np'].str['t']

print(df)

           q
0  773183901
1  773183902
2  773183903
3  773183904

Upvotes: 1

Pawan Jain
Pawan Jain

Reputation: 825

The best way is to get the desired columns

df1 = pd.json_normalize(data, record_path=['q','ps','np'],  \
    meta=['currency','standardDate','reportDate','<All required fields>'])

Upvotes: 2

Related Questions