Aggregate dataframe rows into a dictionary

Question

I have a pandas DataFrame object where each row represents one object in an image.

One example of a possible row would be:

{'img_filename': 'img1.txt', 'img_size':'20', 'obj_size':'5', 'obj_type':'car'}

I want to aggregate all the objects that belong to the same image, and get something whose rows would be like:

{'img_filename': 'img1.txt', 'img_size':'20', 'obj': [{'obj_size':'5', 'obj_type':'car'}, {'obj_size':'6', 'obj_type':'bus'}]}

That is, the third column is a list of dictionaries, one per object in the group.

How can I do this?

EDIT:

Consider the following example.

import pandas as pd
df1 = pd.DataFrame([
{'img_filename': 'img1.txt', 'img_size':'20', 'obj_size':'5', 'obj_type':'car'}, 
{'img_filename': 'img1.txt', 'img_size':'20', 'obj_size':'6', 'obj_type':'bus'}, 
{'img_filename': 'img2.txt', 'img_size':'25', 'obj_size':'4', 'obj_type':'car'}
])

df2 = pd.DataFrame([
{'img_filename': 'img1.txt', 'img_size':'20', 'obj': [{'obj_size':'5', 'obj_type':'car'}, {'obj_size':'6', 'obj_type':'bus'}]},
{'img_filename': 'img2.txt', 'img_size':'25', 'obj': [{'obj_size':'4', 'obj_type':'car'}]}
])

I want to turn df1 into df2.

Abhi · Accepted Answer

One way is to use to_dict:

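# Select the columns with a list (double brackets); tuple-style selection on a groupby is not supported in recent pandas versions.
df2 = df1.groupby('img_filename')[['obj_size', 'obj_type']].apply(lambda x: x.to_dict('records'))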
df2 = df2.reset_index(name='obj')

# Assuming the same image file can appear with different sizes, I take the first size.
# If that is not the case, group by both columns directly and reset the index:
#df1.groupby(['img_filename', 'img_size'])[['obj_size', 'obj_type']].apply(lambda x: x.to_dict('records')).reset_index(name='obj')

df2['img_size'] = df1.groupby('img_filename')['img_size'].first().values

print(df2)

  img_filename                                                obj img_size
0     img1.txt  [{'obj_size': '5', 'obj_type': 'car'}, {'obj_s...       20
1     img2.txt             [{'obj_size': '4', 'obj_type': 'car'}]       25
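
If each image always has a single size, a plain loop over the groups is an alternative that avoids the separate img_size assignment. This is only a sketch under that assumption, not part of the original answer. It reuses df1 from the question, and the name rows is introduced here purely for illustration.

```python
import pandas as pd

df1 = pd.DataFrame([
    {'img_filename': 'img1.txt', 'img_size': '20', 'obj_size': '5', 'obj_type': 'car'},
    {'img_filename': 'img1.txt', 'img_size': '20', 'obj_size': '6', 'obj_type': 'bus'},
    {'img_filename': 'img2.txt', 'img_size': '25', 'obj_size': '4', 'obj_type': 'car'},
])

# Assumption: every img_filename has exactly one img_size, so grouping
# by both columns yields one group per image.
rows = []  # hypothetical helper list, one dict per output row
for (img_filename, img_size), group in df1.groupby(['img_filename', 'img_size'], sort=False):
    rows.append({
        'img_filename': img_filename,
        'img_size': img_size,
        # One dictionary per object row in this image
        'obj': group[['obj_size', 'obj_type']].to_dict('records'),
    })

df2 = pd.DataFrame(rows)
print(df2)
```

With sort=False the images keep the order in which they first appear in df1. If an image did appear with more than one size, this version would produce one row per (filename, size) pair instead of picking the first size.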
