Pandas, duplicate a row based on a condition

Question

I have a dataframe like this -

What I want to do is, whenever there is 'X' in Col3, that row should get duplicated and 'X' should be changed to 'Z'. The result must look like this -

I tried a few approaches, but nothing worked. Can somebody please guide me on how to do this?

jezrael · Accepted Answer

You can first filter the matching rows by boolean indexing and set Col3 to 'Z' with DataFrame.assign, join the result to the original with concat, sort the index with DataFrame.sort_index using the stable mergesort algorithm (so each copy lands directly after its original row), and finally create a default RangeIndex with DataFrame.reset_index and drop=True:

import pandas as pd

# Sample data: rows 1 and 4 have 'X' in Col3
df = pd.DataFrame({
    'B': [4, 5, 4, 5, 5, 4],
    'C': [7, 8, 9, 4, 2, 3],
    'Col3': list('aXcdXf'),
    'D': [1, 3, 5, 7, 1, 0],
    'E': [5, 3, 6, 9, 2, 4],
    'F': list('aaabbb'),
})


df = (pd.concat([df, df[df['Col3'].eq('X')].assign(Col3='Z')])
        .sort_index(kind='mergesort')
        .reset_index(drop=True))
print(df)
   B  C Col3  D  E  F
0  4  7    a  1  5  a
1  5  8    X  3  3  a
2  5  8    Z  3  3  a
3  4  9    c  5  6  a
4  5  4    d  7  9  b
5  5  2    X  1  2  b
6  5  2    Z  1  2  b
7  4  3    f  0  4  b
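
To see what each step of the chain contributes, it can help to split it into intermediate variables. The following is only a sketch: the three-row frame and the names mask, copies, stacked and result are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Hypothetical small frame, used only to illustrate the steps
df = pd.DataFrame({
    'B': [4, 5, 4],
    'Col3': list('aXc'),
})

# Step 1: boolean mask selecting the rows to duplicate
mask = df['Col3'].eq('X')

# Step 2: copies of those rows with Col3 replaced;
# the copies keep the index labels of their originals
copies = df[mask].assign(Col3='Z')

# Step 3: append the copies; the index now holds a repeated label
stacked = pd.concat([df, copies])

# Step 4: a stable sort keeps the original row ahead of its copy
# when both share the same index label, then the index is rebuilt
result = stacked.sort_index(kind='mergesort').reset_index(drop=True)
print(result)
```

The stable sort matters because the original row and its copy share an index label: a stable algorithm preserves their concat order, so 'X' stays ahead of 'Z'. An unstable sort gives no such guarantee.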
