vervit khandelwal
vervit khandelwal

Reputation: 39

Convert multi nested Json to Pandas DataFrame

{
"ABC": {
"A": {
"Date": "01/01/2021",
"Value": "0.09"
},
"B": {
"Date": "01/01/2021",
"Value": "0.001"
}
},
"XYZ": {
"A": {
"Date": "01/01/2021",
"Value": "0.006"
},
"B": {
"Date": "01/01/2021",
"Value": "0.000"
}
}
}

Current output after applying pd.json_normalize(x,max_level=1)

enter image description here

Expected Output : Need to Convert this to pandas DataFrame

enter image description here

If any one can help or give some advice on working with this data that would be great!

Upvotes: 0

Views: 80

Answers (2)

Valdi_Bo
Valdi_Bo

Reputation: 30971

One of possible options is custom processing of your x object, creating a list of rows:

lst = []
for k1, v1 in x.items():
    row = {}
    row['key'] = k1
    for k2, v2 in v1.items():
        dd = v2["Date"]
        vv = float(v2["Value"])
        row['Date'] = dd
        row[k2] = vv
    lst.append(row)

Note that the above code also converts Value to float type.

I assumed that all dates in each first-level object are the same, so in the second level loop Date is overwritten, but I assume that this does no harm.

Then you can create the output DataFrame as follows:

df = pd.DataFrame(lst)
df.set_index('key', inplace=True)
df.index.name = None

The result is:

           Date      A      B
ABC  01/01/2021  0.090  0.001
XYZ  01/01/2021  0.006  0.000

Although it is possible to read x using json_normalize into a temporary DataFrame, the sequence of operations to convert it to your desired shape would be complicated.

This is why I came up with the above, in my opinion conceptually simpler solution.

Upvotes: 3

keramat
keramat

Reputation: 4543

Use the following while the js is your input dict:

s = pd.DataFrame(js)
ss = s.apply(lambda x: [pd.Series(y)['Value'] for y in x])
ss['Date'] = s['ABC'].apply(pd.Series)['Date']

result:

enter image description here

Upvotes: 3

Related Questions