Reputation: 324
I have a pivot table that counts the monthly occurrences of a phenomenon. Here's the simplified sample data, followed by the pivot:
+--------+------------+------------+
| ad_id | entreprise | date |
+--------+------------+------------+
| 172788 | A | 2020-01-28 |
| 172931 | A | 2020-01-26 |
| 172793 | B | 2020-01-26 |
| 172768 | C | 2020-01-19 |
| 173219 | C | 2020-01-14 |
| 173213 | D | 2020-01-13 |
+--------+------------+------------+
My pivot_table code is the following:
my_pivot_table = pd.pivot_table(df[(df['date'] >= some_date) & (df['date'] <= some_other_date)],
                                values=['ad_id'], index=['entreprise'],
                                columns=['year', 'month'], aggfunc=['count'])
The resulting table looks like this:
+-------------+---------+----------+-----+----------+
| | 2018 | | | |
+-------------+---------+----------+-----+----------+
| entreprise | january | february | ... | december |
| A | 12 | 10 | ... | 8 |
| B | 24 | 12 | ... | 3 |
| ... | ... | ... | ... | ... |
| D | 31 | 18 | ... | 24 |
+-------------+---------+----------+-----+----------+
Now, I would like to add a column that gives me the monthly average, and perform other operations such as comparing last month's count to the monthly average of, say, the last 12 months...
I tried to fiddle with the aggfunc parameter of the pivot_table, as well as trying to add an average column to the original dataframe, but without success.
Thanks in advance!
Upvotes: 1
Views: 691
Reputation: 863226
Because pivot_table returns a table with MultiIndex columns, you can average over one level of the columns (here df is the pivoted table):
df1 = df.mean(axis=1, level=0)
df1.columns = pd.MultiIndex.from_product([df1.columns, ['mean']])
Or:
df2 = df.mean(axis=1, level=1)
df2.columns = pd.MultiIndex.from_product([['all_years'], df2.columns])
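Note that the level argument of DataFrame.mean was deprecated in pandas 1.3 and removed in pandas 2.0, so the two snippets above only run on older versions. Below is a minimal sketch of the same idea using groupby on the transposed table. The sample rows are made up for illustration, and the sketch assumes year and month columns derived from date, with scalar values and aggfunc so that the pivot has exactly two column levels (year, month):

```python
import pandas as pd

# Hypothetical sample data; column names follow the question.
df = pd.DataFrame({
    "ad_id": [1, 2, 3, 4, 5, 6, 7],
    "entreprise": ["A", "A", "B", "A", "B", "B", "A"],
    "date": pd.to_datetime([
        "2019-01-05", "2019-01-20", "2019-01-07",
        "2019-02-11", "2020-01-03", "2020-02-14", "2020-02-20",
    ]),
})
# Assumption: year and month are derived from the date column.
df["year"] = df["date"].dt.year
df["month"] = df["date"].dt.month

# Scalar values/aggfunc keep the columns at two levels: (year, month).
counts = pd.pivot_table(df, values="ad_id", index="entreprise",
                        columns=["year", "month"], aggfunc="count",
                        fill_value=0)

# Mean over the months of each year (replaces mean(axis=1, level=0)).
per_year = counts.T.groupby(level=0).mean().T
per_year.columns = pd.MultiIndex.from_product([per_year.columns, ["mean"]])

# Mean of each month across all years (replaces mean(axis=1, level=1)).
per_month = counts.T.groupby(level=1).mean().T
per_month.columns = pd.MultiIndex.from_product([["all_years"], per_month.columns])

# Append the per-year means to the original counts.
result = pd.concat([counts, per_year], axis=1)
print(result)
```

If the pivot is built with list arguments as in the question (values=['ad_id'], aggfunc=['count']), the columns get extra leading levels, so the level numbers passed to groupby would need to be adjusted accordingly.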
Upvotes: 3