Reputation: 2189
I have a data frame like this,
df
col1 col2 col3
A [1,2] [[1,2],[3,4]]
B [5] [[6,7]]
C [8,9] [[10,11],[12,13]]
A [14] [[15,16]]
Now, if a value in col1 is duplicated, I want to append that row's col2 and col3 to the first row holding that value, so the final data frame would look like this:
col1 col2 col3
A [1,2,14] [[1,2],[3,4],[15,16]]
B [5] [[6,7]]
C [8,9] [[10,11],[12,13]]
The values from the last A row are appended to the first row where A appears. I could do this with a for loop that compares each row with the previous ones, but the execution time would be huge, so I am looking for a pandas shortcut that does this efficiently.
Upvotes: 0
Views: 201
Reputation: 323326
Try groupby with sum:
newdf = df.groupby('col1',as_index=False).sum()
Out[31]:
col1 col2 col3
0 A [1, 2, 14] [[1, 2], [3, 4], [15, 16]]
1 B [5] [[6, 7]]
2 C [8, 9] [[10, 11], [12, 13]]
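For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the cells are real Python lists (not strings that look like lists), which the question does not state explicitly. The sum works because adding two lists in Python concatenates them. The `sort=False` argument is my addition, not part of the original one-liner: it keeps the groups in order of first appearance instead of sorting by key.

```python
import pandas as pd

# Rebuild the example frame from the question (assumption: cells are real lists).
# col2 holds flat lists, col3 holds lists of pairs.
df = pd.DataFrame({
    "col1": ["A", "B", "C", "A"],
    "col2": [[1, 2], [5], [8, 9], [14]],
    "col3": [[[1, 2], [3, 4]], [[6, 7]], [[10, 11], [12, 13]], [[15, 16]]],
})

# Summing object columns adds the cells with +, and list + list concatenates.
# sort=False keeps groups in the order their keys first appear.
newdf = df.groupby("col1", as_index=False, sort=False).sum()
print(newdf)
```

If the columns hold strings such as "[1,2]" rather than lists, the sum would join the strings instead, so they would need to be parsed into lists first.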
Upvotes: 1