Reputation: 145
I have the code below, which iterates through the df and creates an additional column, new, containing a dict of the other columns. Is there a better way to achieve this without using iterrows? My actual dataset is much larger, and iterating through the rows is too slow.
Code
import pandas as pd
import json

data = {'Name': ['Tom', 'Nick', 'Jim', 'John'],
        'Age': [20, 21, 35, 11]}
df = pd.DataFrame(data)

# Serialize each row to a JSON string, one row at a time
for i, row in df.iterrows():
    df.loc[i, 'new'] = json.dumps(row.to_dict())

print(df)
Output
   Name  Age                          new
0   Tom   20   {"Name": "Tom", "Age": 20}
1  Nick   21  {"Name": "Nick", "Age": 21}
2   Jim   35   {"Name": "Jim", "Age": 35}
3  John   11  {"Name": "John", "Age": 11}
Upvotes: 1
Views: 42
Reputation: 75080
You can try df.to_dict('records'), which returns one dict per row, together with df.join:
out = df.join(pd.Series(df.to_dict('records'),index=df.index,name='new'))
print(out)
   Name  Age                          new
0   Tom   20   {'Name': 'Tom', 'Age': 20}
1  Nick   21  {'Name': 'Nick', 'Age': 21}
2   Jim   35   {'Name': 'Jim', 'Age': 35}
3  John   11  {'Name': 'John', 'Age': 11}
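Note that the new column above holds Python dicts (single quotes in the output), whereas the loop in the question stores JSON strings via json.dumps. If the JSON strings are what you need, one possible variation is to serialize each record after calling to_dict. This is only a sketch on the sample data from the question: the names records and out are just illustrative, and it assumes every value in the frame is something json.dumps can serialize.

```python
import json

import pandas as pd

data = {'Name': ['Tom', 'Nick', 'Jim', 'John'],
        'Age': [20, 21, 35, 11]}
df = pd.DataFrame(data)

# One dict per row, built in a single call instead of iterrows
records = df.to_dict('records')

# Serialize each record to a JSON string and attach it as the 'new' column
out = df.assign(new=[json.dumps(rec) for rec in records])
print(out)
```

There is still a Python-level loop over the records here, but it avoids building a Series for every row and writing through df.loc one cell at a time, which is typically where most of the iterrows cost comes from.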
Upvotes: 2