Reputation: 165
I tried to use groupby
with pandas but being fairly new to python i can't seem to find a solution
raw_data = {'Products': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C'],
'Month': ['201903', '201903', '201902', '201901', '201902', '201901', '201902', '201904','201903', '201902', '201904', '201903'],
'Sales': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3]}
df = pd.DataFrame(raw_data, columns = ['Products', 'Month', 'Sales'])
df
data looks like so
Products Month Sales
0 A 201903 4
1 A 201903 24
2 A 201902 31
3 A 201901 2
4 B 201902 3
5 B 201901 4
6 B 201902 24
7 C 201904 31
8 C 201903 2
9 C 201902 3
10 C 201904 2
11 C 201903 3
and i need , per Product to display latest two months with sum of sales in said months like so
Products Months Sales
A 201902 31
A 201903 28
B 201901 4
B 201902 27
C 201903 5
C 201904 33
i'm sorry if everything is not formatted correctly, still new to SO
Thank you
Upvotes: 2
Views: 593
Reputation: 27869
This will do it:
(df.groupby(['Products', 'Month'], as_index=False)
.sum()
.sort_values(['Products', 'Sales'],
ascending=(True,False))
.groupby('Products')
.head(2))
Products Month Sales
1 A 201902 31
2 A 201903 28
4 B 201902 27
3 B 201901 4
7 C 201904 33
6 C 201903 5
Upvotes: 2