Reputation: 29
I have a DataFrame like this:
import pandas as pd

df = pd.DataFrame({
    'user_id': [0, 0, 0, 1, 1, 1, 2, 2, 2],
    '0': [42, 42, 42, 4, 4, 4, 17, 17, 17],
    '1': [81, 81, 81, 31, 31, 31, 54, 54, 54],
    '2': [13, 13, 13, 7, 7, 7, 33, 33, 33]})
My goal is a DataFrame like this:
df = pd.DataFrame({
    'user_id': [0, 0, 0, 1, 1, 1, 2, 2, 2],
    'goal': [42, 81, 13, 4, 31, 7, 17, 54, 33]
})
I've tried df.unstack(), but it didn't produce this result.
Any ideas on how to achieve it?
Upvotes: 2
Views: 388
Reputation: 23217
Drop duplicates + set_index() on user_id + .stack(), as follows:
(df.drop_duplicates()
.set_index('user_id')
.stack()
.droplevel(-1)
.reset_index(name='goal')
)
Result:
   user_id  goal
0        0    42
1        0    81
2        0    13
3        1     4
4        1    31
5        1     7
6        2    17
7        2    54
8        2    33
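As a quick sanity check, the chain above can be compared against the frame the question asks for. This is only a sketch: the names `expected` and `result` are introduced here for illustration and do not appear in the original posts.

```python
import pandas as pd

# Input frame from the question
df = pd.DataFrame({
    'user_id': [0, 0, 0, 1, 1, 1, 2, 2, 2],
    '0': [42, 42, 42, 4, 4, 4, 17, 17, 17],
    '1': [81, 81, 81, 31, 31, 31, 54, 54, 54],
    '2': [13, 13, 13, 7, 7, 7, 33, 33, 33]})

# Target frame from the question (named `expected` here for illustration)
expected = pd.DataFrame({
    'user_id': [0, 0, 0, 1, 1, 1, 2, 2, 2],
    'goal': [42, 81, 13, 4, 31, 7, 17, 54, 33]})

result = (df.drop_duplicates()        # one row per user_id
            .set_index('user_id')     # keep user_id out of the stacked values
            .stack()                  # columns '0', '1', '2' become rows
            .droplevel(-1)            # discard the former column labels
            .reset_index(name='goal'))

# Raises AssertionError if the two frames differ
pd.testing.assert_frame_equal(result, expected)
```

Note that this relies on the rows for each user_id being exact duplicates, as they are in the sample data; otherwise drop_duplicates() would keep more than one row per user.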
Upvotes: 1
Reputation: 28669
You can remove the duplicates, melt and keep only the relevant columns:
(df.drop_duplicates()
.melt('user_id', value_name='goal', ignore_index=False)
.drop(columns='variable')
.sort_index()
)
   user_id  goal
0        0    42
0        0    81
0        0    13
3        1     4
3        1    31
3        1     7
6        2    17
6        2    54
6        2    33
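The output above keeps the original row labels (0, 3, 6), each repeated three times. If a fresh 0..8 index like the one in the question's target is wanted, one possible variation is sketched below. Two things here are additions rather than part of the answer: `kind='mergesort'` is passed so that rows sharing an index label are guaranteed to keep their melted order (the default sort kind does not promise that in general), and `reset_index(drop=True)` renumbers the rows. The name `result` is illustrative.

```python
import pandas as pd

# Input frame from the question
df = pd.DataFrame({
    'user_id': [0, 0, 0, 1, 1, 1, 2, 2, 2],
    '0': [42, 42, 42, 4, 4, 4, 17, 17, 17],
    '1': [81, 81, 81, 31, 31, 31, 54, 54, 54],
    '2': [13, 13, 13, 7, 7, 7, 33, 33, 33]})

result = (df.drop_duplicates()
            .melt('user_id', value_name='goal', ignore_index=False)
            .drop(columns='variable')
            .sort_index(kind='mergesort')  # stable sort: ties keep column order '0', '1', '2'
            .reset_index(drop=True))       # replace labels 0, 0, 0, 3, ... with 0..8

print(result)
```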
Upvotes: 2