Reputation: 576
I have a data frame df
index Heads
as 4
as 3
as 2
as 5
as 3
cd 4
cd 5
cd 6
Using the below code I can get the output as :
avg = df['Heads'].groupby(df.index).mean()
df.reset_index().pivot_table(columns=["index"]).T
index Heads
as 3.4
cd 5
But I have an other data frame df2
with an extra column of dates in it like:
index date Heads
as 01-02-2000 4
as 04-03-2002 3
as 09-01-2003 2
as 23-12-2010 5
as 14-04-2006 3
cd 04-01-2004 4
cd 04-05-2007 5
cd 04-05-2001 6
Here, I want to take the average of Heads
as in the above case, but only consider the elements which fall in between the years 2000 to 2005. Hence, the expected output is:
index Heads
as 3
cd 5
Upvotes: 3
Views: 65
Reputation: 863531
Use Series.between
with Series.dt.year
in boolean indexing
and then is possible use mean
with level
parameter:
df['date'] = pd.to_datetime(df['date'], dayfirst=True)
df = (df[df['date'].dt.year.between(2000, 2005)]
.mean(level=0)
.reset_index())
print (df)
index Heads
0 as 3
1 cd 5
Or:
df = (df[df['date'].dt.year.between(2000, 2005)]
.groupby(level=0).mean()
.reset_index())
print (df)
index Heads
0 as 3
1 cd 5
Upvotes: 1