wjvaug
wjvaug

Reputation: 23

Pandas: merge dataframe rows and take an average of the second column values

I have a two-column data frame with the first column containing a date (yyyy-mm-dd) and the second a rating out of five, i.e. '1' or '2' etc. The df is in order by date, with dates descending from the first row.

I am looking for a way to merge the rows containing identical date values - i.e. all of the 2021-05-05 and then take the average of all of the rating values for that given date to provide the corresponding rating average for that date.

For instance, if my df looks like this:

    Date        Rating
0  2021-05-05   1
1  2021-05-05   3
2  2021-05-05   2
3  2021-05-04   4
4  2021-05-04   6
5  2021-05-04   5

I want to merge it so it becomes like so:

    Date        Rating
0  2021-05-05   2
1  2021-05-04   5

Upvotes: 2

Views: 2613

Answers (2)

SeaBean
SeaBean

Reputation: 23217

You can use df.groupby() with sort=False (to maintain original sorting sequence) and as_index=False (to keep Date column as data column instead of row index). Use built-in aggregate function .mean() to get the average.

df.groupby('Date', as_index=False, sort=False)['Rating'].mean()

Output:

         Date  Rating
0  2021-05-05       2
1  2021-05-04       5

Upvotes: 2

imdevskp
imdevskp

Reputation: 2223

You can try pandas .groupby() and .mean()

result = df.groupby('Date', sort=False)['Rating'].mean().reset_index()

Upvotes: 0

Related Questions