Kallol

Reputation: 2189

Assign to the previous values when duplicate value is found in pandas data frame

I have a data frame like this,

df 
col1      col2       col3
 A        [1,2]      [[1,2],[3,4]]
 B        [5]        [[6,7]]
 C        [8,9]      [[10,11],[12,13]]
 A        [14]       [[15,16]]

Now, if a value in col1 is duplicated, I want to append the col2 and col3 values of the duplicate row to the row where that value first appears, so the final data frame would look like:

col1      col2           col3
A         [1,2,14]       [[1,2],[3,4],[15,16]]
B         [5]            [[6,7]]
C         [8,9]          [[10,11],[12,13]]

The values from the last row (A) are appended to the first row where A appears. I could do this with a for loop that compares each row against the previous values, but the execution time would be huge, so I am looking for a pandas shortcut that does this efficiently.

Upvotes: 0

Views: 201

Answers (1)

BENY

Reputation: 323326

Try groupby with sum. Adding Python lists concatenates them, so summing each group merges its lists:

newdf = df.groupby('col1', as_index=False).sum()
newdf
Out[31]: 
  col1        col2                        col3
0    A  [1, 2, 14]  [[1, 2], [3, 4], [15, 16]]
1    B         [5]                    [[6, 7]]
2    C      [8, 9]        [[10, 11], [12, 13]]
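
If you would rather spell out the concatenation than rely on `sum` over list-valued columns, something like the following should give the same result. This is only a sketch: the frame is rebuilt from the question's sample data, and `concat_lists` is a hypothetical helper name, not part of pandas.

```python
from itertools import chain

import pandas as pd

# Sample frame rebuilt from the question (assumed to hold plain Python lists).
df = pd.DataFrame({
    "col1": ["A", "B", "C", "A"],
    "col2": [[1, 2], [5], [8, 9], [14]],
    "col3": [[[1, 2], [3, 4]], [[6, 7]], [[10, 11], [12, 13]], [[15, 16]]],
})


def concat_lists(series):
    # Hypothetical helper: join every list in one group into a single list.
    return list(chain.from_iterable(series))


# sort=False keeps the groups in order of first appearance in col1.
newdf = (
    df.groupby("col1", sort=False)
      .agg({"col2": concat_lists, "col3": concat_lists})
      .reset_index()
)
print(newdf)
```

`chain.from_iterable` walks each group once, whereas repeated list addition copies the accumulated list at every step, so this version may scale better when a key has many duplicate rows.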

Upvotes: 1
