Reputation: 369
I have a dataframe that looks like in the pictures.
Now, I want to add a new column that will show the average of power for each day (given the data is sampled every 5 minutes), but separately for when it is day_or_night (day = 0 in the column, night = 1). I've gotten this far:
train['avg_by_day'][train['day_or_night']==1] = train['power'][train['day_or_night']==1].mean()
train['avg_by_day'][train['day_or_night']==0] = train['power'][train['day_or_night']==0].mean()
but this just adds the average of all the power values that correspond to day, or similarly - night, which isn't what I'm after: a specific average for each day/night separately.
I need something like: train['avg_by_day'] == train.power.mean() when day == 1 and day_or_night == 1, and this for each day.
Upvotes: 2
Views: 682
Reputation: 2482
So you want to group the dataframe by day
and day_or_night
and create a new column with mean power
values for each group:
train['avg_by_day'] = train.groupby(['day','day_or_night'])['power']\
.transform('mean')
Maybe you should also include year
and month
in the grouping columns because otherwise it's going to group the 1st day of every month together, same for the 2nd day and so on.
Upvotes: 3