Reputation: 696
I would like to count all product_id's
per region grouped by month.
Before doing that I need to map the zip code to the region using the duct below:
regiondict = {
"10": "A",
"23": "A",
"24": "A",
"53": "B",
"32": "B",
"49": "B",
"11": "C",
"14": "C",
"65": "C"
}
Main df
product_id zip_code month
32324 10 Feb-2019
34345 23 Feb-2019
45453 24 March-2019
34343 53 Feb-2019
53533 32 March-2019
12334 49 March-2019
99934 11 Feb-2019
21213 14 Feb-2019
13232 65 March-2019
Output df
region count_product_id month
A 2 Feb-2019
A 1 March-2019
B 1 Feb-2019
B 2 March-2019
B 2 Feb-2019
B 1 March-2019
Upvotes: 0
Views: 125
Reputation: 323276
Basically you can do map
with groupby
count
, plus some cosmetic after.
df_ = (df.groupby([df.zip_code.astype(str).map(regiondict), 'month'])
['product_id'].count()
.reset_index()
.rename(columns={'zip_code':'region'}))
print(df_)
region month product_id
0 A Feb-2019 2
1 A March-2019 1
2 B Feb-2019 1
3 B March-2019 2
4 C Feb-2019 2
5 C March-2019 1
Upvotes: 1