geds133
geds133

Reputation: 1485

How to merge values on datetime column whilst taking average

I am trying to merge my columns by a datetime column, date, whilst taking and average on the yearly_cost columns. My df looks as such:

        date        yearly_cost_x  yearly_cost_y  yearly_cost
0     2009-01-01        5               7              3
1     2009-01-02        8               7              4
2     2009-01-03        23              6              6

I wish to merge the df by 'date' and once merge take an average of the 3 values to create a single value for every row in a columns named Yearly_Cost. I feel this should be easy but am somehow struggling and receiving several errors.

I wish for my df output to looks as such:

        date        Yearly_Cost
0     2009-01-01        5
1     2009-01-02        6.33
2     2009-01-03        11.66

Any help would be greatly appreciated!

ADDITION:

I have a column with multiple dates and one yearly_cost column as such. It looks as such:

        date        Yearly_Cost
0     2009-01-01        5
1     2009-01-02        6
2     2009-01-03        11
3     2009-01-01        12
4     2009-01-02        45
5     2009-01-03        32

I wish for it to look this this:

        date        Yearly_Cost
0     2009-01-01        8.5
1     2009-01-02        25.5
2     2009-01-03        21.5

Upvotes: 1

Views: 63

Answers (2)

jezrael
jezrael

Reputation: 862651

Use DataFrame.set_index with mean per rows with axis=1 and last for DataFrame is used Series.reset_index:

df1 = df.set_index('date').mean(axis=1).reset_index(name='Yearly_Cost')
print (df1)
         date  Yearly_Cost
0  2009-01-01     5.000000
1  2009-01-02     6.333333
2  2009-01-03    11.666667

If possible another non yearly columns filter columns by DataFrame.filter:

df1 = df.set_index('date').filter(like='yearly_').mean(axis=1).reset_index(name='Yearly_Cost')

Upvotes: 1

nassim
nassim

Reputation: 1555

try with this :

df['Yearly_Cost'] = df['Yearly_Cost'].apply(lambda x : (x['yearly_cost_x']+x['yearly_cost_y']+x['yearly_cost'])/3 )
df.drop(['yearly_cost_x','yearly_cost_y','yearly_cost'],axis=1)

Upvotes: 0

Related Questions