Gun
Gun

Reputation: 576

Using pandas how to find the average of certain elements in a column based on their date?

I have a data frame df

index  Heads
as     4
as     3
as     2
as     5
as     3
cd     4
cd     5
cd     6

Using the below code I can get the output as :

avg = df['Heads'].groupby(df.index).mean()
df.reset_index().pivot_table(columns=["index"]).T

index  Heads
as     3.4   
cd     5

But I have an other data frame df2 with an extra column of dates in it like:

index  date         Heads
as     01-02-2000   4
as     04-03-2002   3
as     09-01-2003   2
as     23-12-2010   5
as     14-04-2006   3
cd     04-01-2004   4
cd     04-05-2007   5
cd     04-05-2001   6

Here, I want to take the average of Heads as in the above case, but only consider the elements which fall in between the years 2000 to 2005. Hence, the expected output is:

index  Heads
as     3   
cd     5  

Upvotes: 3

Views: 65

Answers (1)

jezrael
jezrael

Reputation: 863531

Use Series.between with Series.dt.year in boolean indexing and then is possible use mean with level parameter:

df['date'] = pd.to_datetime(df['date'], dayfirst=True)


df = (df[df['date'].dt.year.between(2000, 2005)]
         .mean(level=0)
         .reset_index())
print (df)
  index  Heads
0    as      3
1    cd      5

Or:

df = (df[df['date'].dt.year.between(2000, 2005)]
         .groupby(level=0).mean()
         .reset_index())
print (df)
  index  Heads
0    as      3
1    cd      5

Upvotes: 1

Related Questions