Reputation: 111
I have an input dataframe as:
ID Visit11 Visit12 Visit13 Visit14 Visit15
1 Orange
2 Orange
2 Apple
3 Grapes
4 Apple
5 Not Defined
5 Apple
6 Apple
7 Banana
7
7
7
7
7
7
8 Banana
8 Apple
8 Banana
8 Apple
8 Banana
9
9
9
9
I am using groupby to get the expected output, but it combines all the purchases into one cell. I want the purchases kept in separate columns, with one row per user. The expected output should be:
ID Visit11 Visit12 Visit13 Visit14 Visit15
1 Orange
2 Orange Apple
3 Grapes
4 Apple
5 Not Defined Apple
6 Apple
7 Banana
8 Banana Apple Banana Apple Banana
9
Upvotes: 1
Views: 35
Reputation: 862911
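I believe you need to replace the empty strings with missing values, then reshape with stack and unstack so that each ID collapses to one row: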
print (df)
ID Visit11 Visit12
0 1 Orange
1 2 Orange
2 2 Apple
3 3 Grapes
4 4 Apple
5 5 Not Defined
6 5 Apple
import numpy as np
# convert empty strings to NaN so they are treated as missing values
df = df.replace('', np.nan)
df1 = df.set_index('ID').stack().unstack().sort_index(axis=1).reset_index().fillna('')
print (df1)
ID Visit11 Visit12
0 1 Orange
1 2 Apple Orange
2 3 Grapes
3 4 Apple
4 5 Not Defined Apple
Alternative solution, using GroupBy.first, which returns the first non-missing value of each column per group:
df = df.replace('', np.nan)
df1 = df.groupby('ID', as_index=False).first().fillna('')
print (df1)
ID Visit11 Visit12
0 1 Orange
1 2 Apple Orange
2 3 Grapes
3 4 Apple
4 5 Not Defined Apple
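For reference, here is a minimal self-contained sketch of the groupby approach. The sample frame is made up for illustration (two visit columns only, with empty strings standing in for visits without a purchase), so the column names and values are assumptions rather than your real data:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: empty strings mark visits with no purchase.
df = pd.DataFrame({
    "ID": [1, 2, 2, 3, 5, 5],
    "Visit11": ["Orange", "", "Apple", "Grapes", "Not Defined", ""],
    "Visit12": ["", "Orange", "", "", "", "Apple"],
})

# Treat empty strings as missing so they are skipped when rows are collapsed.
cleaned = df.replace("", np.nan)

# GroupBy.first keeps the first non-missing value of each column per ID,
# which yields one row per ID; fillna('') restores the blank cells.
result = cleaned.groupby("ID", as_index=False).first().fillna("")
print(result)
```

Note that this keeps only the first non-missing value per column for each ID, so it fits data where every ID has at most one purchase in any given visit column. If an ID can have several values in the same column, the later ones would be dropped.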
Upvotes: 1