Modify Dataframe based on Priority

Question

I have a df like this:

ID   A1    A2     A3    A4    A5
1    1            2            3
2    1     2             3
3    2     1 
4    3            1            2
5

For every ID, I have 5 columns A1 to A5 (In real I have many more) and the values are top priority for a particular ID.

For example: ID 1 has A1, A3 and A5 as priorites, , ID 3 has only 2 A2 and A1 and ID 5 has no Priorities

Resultant DF

ID Priority_1  Priority_2  Priority_3
1    A1            A3         A5
2    A1            A2         A4
3    A2            A1   
4    A3            A5         A1
5

I am trying to same using melt and pivot using this and this_1 and many more, but exactly not able to get the same resultant df.

Any help on this or clarity from my side!!

jezrael · Accepted Answer

Use DataFrame.melt with sorting by DataFrame.sort_values and removing missing rows by DataFrame.dropna, then add new column used for filtering by boolean indexing and Series.le for less or equal and last use DataFrame.pivot with DataFrame.add_prefix, last add DataFrame.reindex for added only mising rows ID:

N = 3
df1 = df.melt('ID').sort_values(['ID','value']).dropna(subset=['value'])
df1['new'] = df1.groupby('ID').cumcount().add(1)
df1 = df1[df1['new'].le(N)]

df2 = df1.pivot('ID','new','variable').add_prefix('Priority_').reindex(df['ID'])
print (df2)
new Priority_1 Priority_2 Priority_3
ID                                  
1           A1         A3         A5
2           A1         A2         A4
3           A2         A1        NaN
4           A3         A5         A1
5          NaN        NaN        NaN

Modify Dataframe based on Priority

Answers (1)

Related Questions