Reputation: 747
Subs_1718 Count_1718 Subs_1819 Count_1819 Subs_1920 Count_1920
Apple 10.0 Grapes 12 Banana 12.0
Grapes 2.0 Apple 6 Grapes 8.0
Banana 2.0 Pineapple 3 Cashew 1.0
Dragonfruit 1.0 Banana 2 Apple 1.0
Kiwi 1.0 Kiwi 2 Melon 1.0
Melon 1.0 Cashew 1 Grapes 1.0
How do I create a new column that value_counts the values of the columns of df['Count_1718']
,
df['Count_1819']
,df['Count_1920']
?
Expected Output:
Subs_1720 Count_1720
Apple 17
Banana 16
Cashew 2
Dragonfruit 1
Grapes 22
Melon 2
Pineapple 1
Upvotes: 1
Views: 126
Reputation: 28644
You can convert the columns to a multiindex, stack, and do a groupby :
df.columns = df.columns.str.split("_", expand=True)
(df.stack()
.groupby("Subs")
.sum()
.reset_index()
.set_axis(["Subs_1720", "Count_1720"], axis=1))
Subs_1720 Count_1720
0 Apple 17.0
1 Banana 16.0
2 Cashew 2.0
3 Dragonfruit 1.0
4 Grapes 23.0
5 Kiwi 3.0
6 Melon 2.0
7 Pineapple 3.0
Upvotes: 2
Reputation: 862641
Use wide_to_long
with aggregate sum
:
df1 = (pd.wide_to_long(df.reset_index(),
i='index',
stubnames=['Subs','Count'],
sep='_',
j='d')
.groupby('Subs')['Count']
.sum()
.rename_axis('Subs_1720')
.reset_index(name='Count_1720'))
print (df1)
Subs_1720 Count_1720
0 Apple 17.0
1 Banana 16.0
2 Cashew 2.0
3 Dragonfruit 1.0
4 Grapes 23.0
5 Kiwi 3.0
6 Melon 2.0
7 Pineapple 3.0
Upvotes: 2
Reputation: 88236
You can use pd.wide_to_long
here specifying the corresponding stubnames with a groupby.sum
:
(pd.wide_to_long(df.reset_index(),
stubnames=['Subs', 'Count'],
i='index', j='ix',
suffix= '_\d+')
.groupby('Subs').sum())
Count
Subs
Apple 17.0
Banana 16.0
Cashew 2.0
Dragonfruit 1.0
Grapes 23.0
Kiwi 3.0
Melon 2.0
Pineapple 3.0
Upvotes: 4