Reputation: 131
I have a data frame representing customer check-ins (visits) at restaurants. year is simply the year in which a check-in at a restaurant happened.
I want to add a column average_checkin to my initial DataFrame df that represents the average number of visits to a restaurant per year.
import numpy as np
import pandas as pd
data = {
'restaurant_id': ['--1UhMGODdWsrMastO9DZw', '--1UhMGODdWsrMastO9DZw','--1UhMGODdWsrMastO9DZw','--1UhMGODdWsrMastO9DZw','--1UhMGODdWsrMastO9DZw','--1UhMGODdWsrMastO9DZw','--6MefnULPED_I942VcFNA','--6MefnULPED_I942VcFNA','--6MefnULPED_I942VcFNA','--6MefnULPED_I942VcFNA'],
'year': ['2016','2016','2016','2016','2017','2017','2011','2011','2012','2012'],
}
df = pd.DataFrame(data, columns=['restaurant_id', 'year'])
# here I count the total number of check-ins a restaurant had
d = df.groupby('restaurant_id')['year'].count().to_dict()
df['nb_checkin'] = df['restaurant_id'].map(d)
mean_checkin = df.groupby(['restaurant_id', 'year']).agg({'nb_checkin': [np.mean]})
mean_checkin.columns = ['mean_checkin']
mean_checkin.reset_index()
# the values in mean_checkin make no sense
# I need to merge it with df to add that new column
I am still new to the pandas library. I tried something like this, but my results make no sense. Is there something wrong with my syntax? If any clarification is needed, please ask.
Upvotes: 0
Views: 1375
Reputation: 13417
The average number of visits per year can be calculated as the total number of visits a restaurant has, divided by the number of unique years you have data for.
grouped = df.groupby(["restaurant_id"])
avg_annual_visits = grouped["year"].count() / grouped["year"].nunique()
avg_annual_visits = avg_annual_visits.rename("avg_annual_visits")
print(avg_annual_visits)
restaurant_id
--1UhMGODdWsrMastO9DZw    3.0
--6MefnULPED_I942VcFNA    2.0
Name: avg_annual_visits, dtype: float64
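As a cross-check, the same figure can be reached in two steps: count the check-ins per restaurant and year, then average those yearly counts per restaurant. The sketch below rebuilds the sample data from the question so it runs on its own; the names yearly_counts and avg_check are introduced here only for illustration. Note that either approach averages only over years that actually appear in the data, so a year with zero check-ins is not counted.

```python
import pandas as pd

# Sample data from the question, rebuilt so this snippet is self-contained
data = {
    "restaurant_id": ["--1UhMGODdWsrMastO9DZw"] * 6 + ["--6MefnULPED_I942VcFNA"] * 4,
    "year": ["2016"] * 4 + ["2017"] * 2 + ["2011"] * 2 + ["2012"] * 2,
}
df = pd.DataFrame(data)

# Step 1: number of check-ins per (restaurant, year) pair
yearly_counts = df.groupby(["restaurant_id", "year"]).size()

# Step 2: average of those yearly counts for each restaurant
avg_check = yearly_counts.groupby(level="restaurant_id").mean()

# Compare with the count() / nunique() shortcut
grouped = df.groupby("restaurant_id")["year"]
avg_annual_visits = grouped.count() / grouped.nunique()
assert (avg_check == avg_annual_visits).all()

print(avg_check)
```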
Then if you wanted to merge it back to your original data:
df = df.merge(avg_annual_visits, left_on="restaurant_id", right_index=True)
print(df)
            restaurant_id  year  avg_annual_visits
0  --1UhMGODdWsrMastO9DZw  2016                3.0
1  --1UhMGODdWsrMastO9DZw  2016                3.0
2  --1UhMGODdWsrMastO9DZw  2016                3.0
3  --1UhMGODdWsrMastO9DZw  2016                3.0
4  --1UhMGODdWsrMastO9DZw  2017                3.0
5  --1UhMGODdWsrMastO9DZw  2017                3.0
6  --6MefnULPED_I942VcFNA  2011                2.0
7  --6MefnULPED_I942VcFNA  2011                2.0
8  --6MefnULPED_I942VcFNA  2012                2.0
9  --6MefnULPED_I942VcFNA  2012                2.0
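If you would rather skip the merge, groupby(...).transform can broadcast the per-restaurant values straight back onto the original rows. This is only a sketch of an alternative, assuming you want the new column directly on df; it rebuilds the question's sample data so it runs on its own, and the variable name years is made up for illustration.

```python
import pandas as pd

# Sample data from the question, rebuilt so this snippet is self-contained
df = pd.DataFrame({
    "restaurant_id": ["--1UhMGODdWsrMastO9DZw"] * 6 + ["--6MefnULPED_I942VcFNA"] * 4,
    "year": ["2016"] * 4 + ["2017"] * 2 + ["2011"] * 2 + ["2012"] * 2,
})

years = df.groupby("restaurant_id")["year"]

# transform returns one value per original row, aligned with df's index,
# so the result can be assigned as a column without a merge
df["avg_annual_visits"] = years.transform("count") / years.transform("nunique")

print(df)
```

The merge version is probably easier to follow when you also need the per-restaurant summary on its own; the transform version saves a step when you only need the column.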
Upvotes: 1