Reputation: 4100
I have a df like this:
ID A1 A2 A3 A4 A5
1 1 2 3
2 1 2 3
3 2 1
4 3 1 2
5
For every ID, I have 5 columns A1 to A5 (In real I have many more) and the values are top priority for a particular ID.
For example: ID 1 has A1, A3 and A5 as priorites, , ID 3 has only 2 A2 and A1 and ID 5 has no Priorities
Resultant DF
ID Priority_1 Priority_2 Priority_3
1 A1 A3 A5
2 A1 A2 A4
3 A2 A1
4 A3 A5 A1
5
I am trying to same using melt
and pivot
using this and this_1 and many more, but exactly not able to get the same resultant df.
Any help on this or clarity from my side!!
Upvotes: 3
Views: 76
Reputation: 862611
Use DataFrame.melt
with sorting by DataFrame.sort_values
and removing missing rows by DataFrame.dropna
, then add new column used for filtering by boolean indexing
and Series.le
for less or equal
and last use DataFrame.pivot
with DataFrame.add_prefix
, last add DataFrame.reindex
for added only mising rows ID
:
N = 3
df1 = df.melt('ID').sort_values(['ID','value']).dropna(subset=['value'])
df1['new'] = df1.groupby('ID').cumcount().add(1)
df1 = df1[df1['new'].le(N)]
df2 = df1.pivot('ID','new','variable').add_prefix('Priority_').reindex(df['ID'])
print (df2)
new Priority_1 Priority_2 Priority_3
ID
1 A1 A3 A5
2 A1 A2 A4
3 A2 A1 NaN
4 A3 A5 A1
5 NaN NaN NaN
Upvotes: 3