Reputation: 820
I have a dataframe in long format with columns: date, ticker, mcap, rank_mcap. The mcap column is the market cap and measures how large a given stock is, and rank_mcap is simply the ranked version of it (where 1 is the largest market cap).
I want to create a top 10 market-cap-weighted asset (e.g. S&P10). In R I do this:
df %>%
filter(day(date) == 1, rank_mcap < 11) %>%
group_by(date) %>%
mutate(weight = mcap / sum(mcap)) %>%
ungroup()
What do I do in pandas? I get the following error
AttributeError: Cannot access callable attribute 'assign' of 'DataFrameGroupBy' objects, try using the 'apply' method
when I try an approach similar to the R one, namely this in Python:
df.\
query('included == True & date.dt.day == 1'). \
groupby('date').\
assign(w=df.mcap / df.mcap.sum())
I studied http://pandas.pydata.org/pandas-docs/stable/comparison_with_r.html and did not come to a conclusion.
Upvotes: 0
Views: 231
Reputation: 3835
You can do it the same way as you did in R, using datar:
from datar.all import f, filter, group_by, ungroup, mutate, sum
df >> \
filter(f.date.day == 1, f.rank_mcap < 11) >> \
group_by(f.date) >> \
mutate(weight = f.mcap / sum(f.mcap)) >> \
ungroup()
Disclaimer: I am the author of the datar package.
Upvotes: 0
Reputation: 323316
Here is how to achieve R's mutate in pandas:
df.query('included == True & date.dt.day == 1').\
assign(weight = lambda x : x.groupby('date',group_keys=False).
apply(lambda y: y.mcap / y.mcap.sum()))
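As a possible alternative to the apply-based version, a grouped transform returns a result aligned to the original rows, which is probably the closest pandas analogue of group_by() + mutate(). The sketch below is only illustrative: the sample data and the helper name top_n_weights are made up for this example, and it filters on rank_mcap (as in the R code) rather than on an included column.

```python
import pandas as pd

# Hypothetical sample data; column names follow the question.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2020-01-01"] * 3 + ["2020-01-02"] * 2 + ["2020-02-01"] * 2
    ),
    "ticker": ["A", "B", "C", "A", "B", "A", "B"],
    "mcap": [300, 100, 50, 310, 90, 200, 200],
    "rank_mcap": [1, 2, 3, 1, 2, 1, 2],
})


def top_n_weights(frame, n=10):
    """Keep first-of-month rows in the top n by market cap and add weights."""
    # Equivalent of filter(day(date) == 1, rank_mcap < n + 1) in the R code.
    mask = (frame["date"].dt.day == 1) & (frame["rank_mcap"] <= n)
    out = frame.loc[mask].copy()
    # transform("sum") broadcasts each date's total back onto its rows,
    # so the division is row-aligned, like mutate() on a grouped tibble.
    out["weight"] = out["mcap"] / out.groupby("date")["mcap"].transform("sum")
    return out


# n=2 only because the toy data is small; use n=10 for a top-10 basket.
result = top_n_weights(df, n=2)
print(result)
```

The original attempt failed because groupby() returns a DataFrameGroupBy, which has no assign method; computing the per-group total with transform avoids calling assign on the grouped object at all.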
Upvotes: 1