kittygirl

Reputation: 2443

How to expand a JSON column into separate columns and append them to the original DataFrame?

import pandas as pd
inp = [
    {'c1': 10, 'cols': {'c2': 20, 'c3': 'str1'}, 'c4': '41'},
    {'c1': 11, 'cols': {'c2': 20, 'c3': 'str2'}, 'c4': '42'},
    {'c1': 12, 'cols': {'c2': 20, 'c3': 'str3'}, 'c4': '43'},
]
df = pd.DataFrame(inp)
print(df)

The df is:

   c1  c4                      cols
0  10  41  {'c2': 20, 'c3': 'str1'}
1  11  42  {'c2': 20, 'c3': 'str2'}
2  12  43  {'c2': 20, 'c3': 'str3'}

The cols column holds JSON-like objects (one dict per row).

I need to decode the cols column, i.e. expand its keys into separate columns, so that df becomes:

   c1    c4    c2      c3
0  10    41    20      str1
1  11    42    20      str2
2  12    43    20      str3

How to do it?
Thanks in advance!

Upvotes: 2

Views: 180

Answers (3)

rafaelc

Reputation: 59274

Use pd.io.json.json_normalize (available as pd.json_normalize since pandas 1.0, where the pd.io.json path is deprecated):

pd.io.json.json_normalize(inp)

Outputs

    c1  c4  cols.c2 cols.c3
0   10  41  20      str1
1   11  42  20      str2
2   12  43  20      str3

If you already have a pd.DataFrame, convert it back to a list of records with to_dict first:

pd.io.json.json_normalize(df.to_dict('records'))
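
Note that the normalized columns come out as cols.c2 and cols.c3, not the c2 and c3 asked for in the question. A minimal sketch, assuming pandas 1.0 or later (hence the top-level pd.json_normalize), that strips the prefix; the rename step is an addition for illustration and not part of the original answer:

```python
import pandas as pd

inp = [
    {'c1': 10, 'cols': {'c2': 20, 'c3': 'str1'}, 'c4': '41'},
    {'c1': 11, 'cols': {'c2': 20, 'c3': 'str2'}, 'c4': '42'},
    {'c1': 12, 'cols': {'c2': 20, 'c3': 'str3'}, 'c4': '43'},
]

# Flatten the nested dicts; nested keys become "cols.c2", "cols.c3".
flat = pd.json_normalize(inp)

# Keep only the part after the first dot, so "cols.c2" -> "c2".
# Names without a dot ("c1", "c4") are left unchanged.
flat.columns = [name.split('.', 1)[-1] for name in flat.columns]

print(flat)
```

This only works cleanly when the nested keys do not clash with existing top-level column names.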

Upvotes: 2

heena bawa

Reputation: 828

You can use:

df = df.join(pd.DataFrame.from_dict(df['cols'].tolist()))
df.drop('cols', axis=1, inplace=True)
print(df)

Output:

   c1  c4  c2    c3
0  10  41  20  str1
1  11  42  20  str2
2  12  43  20  str3
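
One caveat: join aligns rows on the index, and a frame built from a plain list gets a fresh 0..n-1 index. That is fine for the example above, but not for a DataFrame with any other index. A small sketch, assuming a non-default index (the labels 100, 101, 102 are made up for illustration), of how passing index=df.index keeps the rows aligned:

```python
import pandas as pd

inp = [
    {'c1': 10, 'cols': {'c2': 20, 'c3': 'str1'}, 'c4': '41'},
    {'c1': 11, 'cols': {'c2': 20, 'c3': 'str2'}, 'c4': '42'},
    {'c1': 12, 'cols': {'c2': 20, 'c3': 'str3'}, 'c4': '43'},
]
# Hypothetical non-default index, chosen only to show the alignment issue.
df = pd.DataFrame(inp, index=[100, 101, 102])

# Reuse the original index so join can match the rows.
expanded = pd.DataFrame(df['cols'].tolist(), index=df.index)
result = df.drop(columns='cols').join(expanded)

# Without index=df.index the new frame is indexed 0..2, no labels match,
# and the joined columns are filled with NaN.
misaligned = df.drop(columns='cols').join(pd.DataFrame(df['cols'].tolist()))

print(result)
print(misaligned)
```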

Upvotes: 0

jezrael

Reputation: 862681

Use DataFrame.pop to extract the column, convert it to a NumPy array and then to a list, pass that to the DataFrame constructor, and finally use DataFrame.join to attach the result to the original:

df = df.join(pd.DataFrame(df.pop('cols').values.tolist(), index=df.index))
print (df)
   c1  c4  c2    c3
0  10  41  20  str1
1  11  42  20  str2
2  12  43  20  str3
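
The question mentions json_decode, so the real data may hold JSON strings, not dicts. A hedged sketch of the same approach under that assumption (the string-valued cols column here is invented for illustration), decoding each value with json.loads first:

```python
import json
import pandas as pd

# Assumed input: the cols column stores JSON text, not Python dicts.
df = pd.DataFrame({
    'c1': [10, 11, 12],
    'c4': ['41', '42', '43'],
    'cols': [
        '{"c2": 20, "c3": "str1"}',
        '{"c2": 20, "c3": "str2"}',
        '{"c2": 20, "c3": "str3"}',
    ],
})

# Remove the column from df and parse each JSON string into a dict.
decoded = df.pop('cols').map(json.loads)

# Expand the dicts into columns and join them back on the original index.
df = df.join(pd.DataFrame(decoded.tolist(), index=df.index))

print(df)
```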

Upvotes: 1
