Avijeet Singh
Avijeet Singh

Reputation: 3

Taking average values in column B if column A has the same date but different time

So, I have a panda dataframe which has a column with the data and time and another column with a float value. now I want to take an average of the values in the second column if the date is the same irrespective of the time and want to display just one value for that date. can anyone help with that?

E.g-

**1st row**- Column A - '2018-12-08 04:51:58', Column B- '3', Column C- '4'
**2nd row**- Column A - '2018-12-08 04:51:49', Column B- '2', Column C- '3'
**3rd row**- Column A - '2018-12-08 04:51:57', Column B- '1', Column C- '2'

The output that I want is :

Column A- '2018-12-08', Column B- '2', Column C- '3'

P.S the dtype for column A is datetime64[ns], and i have over 3k rows with different datetime and values

Can anyone help with that?

Upvotes: 0

Views: 60

Answers (1)

anky
anky

Reputation: 75100

EDIT As you say your dataframe is datetime64[ns], something like:

    date                C
0   2018-09-14 00:00:00 4
1   2018-09-14 00:10:00 3
2   2018-09-14 01:10:00 2` 

use:

df.groupby(df['date'].dt.date).mean().reset_index()

or :

df.groupby(df['date'].dt.floor('d')).mean().reset_index()

Output:

    date        C
0   2018-09-14  3

For getting the sum and dividing by 2:

df.groupby(df['date'].dt.date).apply(lambda x : x['C'].sum()/2).reset_index().rename(columns={0:'new'})

or (one more method which i just found)

df.groupby(df['date'].dt.floor('d')).apply(lambda x : x['C'].sum()/2).reset_index().rename(columns={0:'new'})

Output:

    date        new
0   2018-09-14  4.5

Upvotes: 1

Related Questions