Reputation: 347
I have a data frame where one objectid maps to one collisionid (not unique)
df1
objectid collisionid
0 3000 0002
1 4000 0001
2 5000 0002
I have another data frame containing all possible collision ids:
df2
collisionid
0 0001
1 0002
2 0003
I want to update df2 to contain a row of all the object ids that map to that collision:
df2 - DESIRED
collisionid objectids
0 0001 4000
1 0002 3000,5000
2 0003
(in this case the objectids are stored as strings).
I have attempted doing something like this:
df2['objectids'] = df2['collisionid'].map(df1.set_index('collisionid')['objectid'])
but I get an error Reindexing only valid with uniquely valued Index objects
. How do I go about doing this and also trying to merge the strings?
Upvotes: 2
Views: 117
Reputation: 402323
Start by merging them, then group and aggregate:
(df2.merge(df1, how='left', on='collisionid')
.fillna('')
.groupby('collisionid', as_index=False)
.agg(','.join))
collisionid objectid
0 0001 4000
1 0002 3000,5000
2 0003
For a specific column, your agg
syntax changes a bit:
(df2.merge(df1, how='left', on='collisionid')
.fillna('')
.groupby('collisionid')
.agg(objectid=('objectid', ','.join)) # agg(output_name=(input_name, aggfunc))
.reset_index())
collisionid objectid
0 0001 4000
1 0002 3000,5000
2 0003
Upvotes: 4