Reputation: 75
I have stock data spanning several years, and I want to remove the bottom x% of stocks by market cap in each month. My idea is to loop over the months, create a new pandas DataFrame for each one, and then remove the bottom x% by market cap within that month. This is what the data looks like:
Upvotes: 0
Views: 639
Reputation: 30052
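import pandas as pd

# Parse the dates and derive a year-month key to group on
df['date'] = pd.to_datetime(df['date'])
df['year-month'] = df['date'].dt.strftime('%Y-%m')
df = df.sort_values(['year-month', 'MarketCap'], ascending=[True, False])
# Within each month, keep only the rows above the 10% MarketCap quantile
df = df.groupby('year-month').apply(lambda x: x[x['MarketCap'] > x['MarketCap'].quantile(.1)]).reset_index(1, drop=True)
# Drop the helper column, then flatten the year-month index and drop that too
df = df.drop(columns=['year-month']).reset_index().drop(columns=['year-month'])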
First, create a column that contains only the year and month. Then sort by year-month ascending and MarketCap descending. Finally, group by the year-month column and filter out the rows that fall in the bottom 10% of MarketCap within each group.
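If you want the percentage to be a parameter (the x% from the question), the same steps can be sketched without apply by computing each month's cutoff with groupby and transform. This is only a rough sketch: it assumes the columns are called date and MarketCap as above, and the function name drop_bottom_by_month and the sample data are made up for illustration.

```python
import pandas as pd


def drop_bottom_by_month(df, pct=0.1, date_col="date", cap_col="MarketCap"):
    """Keep rows whose market cap is above the pct-quantile of their month."""
    # Month key for each row; kept as a separate Series so no helper column is added
    month = pd.to_datetime(df[date_col]).dt.to_period("M")
    # Per-month quantile, broadcast back onto every row of that month
    cutoff = df.groupby(month)[cap_col].transform(lambda s: s.quantile(pct))
    # Strict comparison: rows exactly at the cutoff are dropped too
    return df[df[cap_col] > cutoff]


# Hypothetical sample data: two months with five stocks each
sample = pd.DataFrame({
    "date": ["2020-01-31"] * 5 + ["2020-02-29"] * 5,
    "MarketCap": [10, 20, 30, 40, 50, 5, 15, 25, 35, 45],
})
filtered = drop_bottom_by_month(sample, pct=0.2)
print(filtered)
```

Because the cutoff is computed per month and aligned to the original index, the original row order and index are preserved, and there is no extra year-month column to clean up afterwards.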
Upvotes: 1