HelloToEarth
HelloToEarth

Reputation: 2127

Annual weighted average of pandas dataframe

Intermediate/Beginner Python user here. I need to figure out how to calculate annual weights and weighted averages based upon some stock market data I parsed into dataframes.

I have stock market values and dates from some interval 2003-2018 that is given like so (there are a lot more values than this snippet shows):

ticker  date    marketcap       open    high    low     close
A   2003-03-31  8466487038.0    13.38   13.47   13.0    13.15
A   2003-06-30  11273789220.5   19.5    19.76   19.46   19.55
AA  2017-03-31  6031414196.0    34.06   34.74   33.6    34.4
AA  2017-06-30  6426297428.0    33.16   33.45   32.535  32.65
ICE 2016-03-31  29363680289.5   46.396  47.184  46.242  47.028
ICE 2016-06-30  31796255816.0   50.602  51.349  50.321  51.192

I simply want to take an annual weight for each ticker's marketcap.

Ie:

Weight(YEAR) = AVGTickerMarketCap(YEAR) / AllTickersMarketCap(YEAR)

Since there is more than one value for marketcap every year I need to take an average first of the marketcap for 2003, 2004, 2005...etc for each ticker - hence the AVGTickerMarketCap(YEAR).

I'm only able to figure out how to somewhat calculate AllTickersMarketCap(YEAR):

df1['date'] = pd.to_datetime(df1.date)
avg_all = df1.groupby(df1.date.dt.year['marketcap'].transform('mean')

But this only seems to take the first ticker "A" and applies the same mean for "A" to every year regardless of their ticker.

Where is the issue and how would I then go about finding AVGTickerMarketCap(YEAR)?

Upvotes: 0

Views: 364

Answers (1)

Barker
Barker

Reputation: 2094

The correct syntax to get a column containing your yearly averages would be

avg_all = df1['marketcap'].groupby(df1.date.dt.year).transform('mean')

Similarly to get the average for each year and ticker you would run:

avg_all = df1['marketcap'].groupby([df1.date.dt.year,df1.ticker]).transform('mean')

If you don't want it to be a column that you can map back to your dataframe but instead just want a new dataframe of your values for each year/ticker you can sub out .transform('mean') with .mean().

In your example data each ticker has a unique year that is the same for all values of that ticker and different from all other ticker values so grouping by year and grouping by year and ticker are equivalent operations.

Upvotes: 1

Related Questions