saremisona
saremisona

Reputation: 369

Calculating mean of a specific column by specific rows

I have a dataframe that looks like in the pictures.enter image description here

Now, I want to add a new column that will show the average of power for each day (given the data is sampled every 5 minutes), but separately for when it is day_or_night (day = 0 in the column, night = 1). I've gotten this far:

train['avg_by_day'][train['day_or_night']==1] = train['power'][train['day_or_night']==1].mean() train['avg_by_day'][train['day_or_night']==0] = train['power'][train['day_or_night']==0].mean()

but this just adds the average of all the power values that correspond to day, or similarly - night, which isn't what I'm after: a specific average for each day/night separately.

I need something like: train['avg_by_day'] == train.power.mean() when day == 1 and day_or_night == 1, and this for each day.

Upvotes: 2

Views: 682

Answers (1)

gereleth
gereleth

Reputation: 2482

So you want to group the dataframe by day and day_or_night and create a new column with mean power values for each group:

train['avg_by_day'] = train.groupby(['day','day_or_night'])['power']\
                           .transform('mean')

Maybe you should also include year and month in the grouping columns because otherwise it's going to group the 1st day of every month together, same for the 2nd day and so on.

Upvotes: 3

Related Questions