Reputation: 15
I have a dataframe which is the result of screening through all the S&P500 stocks and is classifying some SOS signals found during the screening and the date where each signal is appearing.
For this dataframe I'm applying the following filter, because I only want to be focused on stocks which have these signals: Test in Rising, Absorption Vol, Stopping Vol for each ticker
s = {'Stopping Vol', 'Absorption Vol', 'Test in Rising'}
df.groupby('Ticker').filter(lambda sf: s.issubset(sf['SOS']))
And this is the output:
Ticker SOS Date frequency
756 BXSL Test in Rising 2021-12-21 11
757 BXSL Stopping Vol 2021-12-27 11
758 BXSL Stopping Vol 2021-12-28 11
759 BXSL Selling Climax 2021-12-28 11
760 BXSL Shakeout 2021-12-28 11
761 BXSL Absorption Vol 2021-12-30 11
762 BXSL Bag Holding 2021-12-30 11
763 BXSL Test in Rising 2021-12-30 11
764 BXSL Stopping Vol 2022-01-03 11
765 BXSL Absorption Vol 2022-01-06 11
766 BXSL Test in Rising 2022-01-06 11
5057 WDAY Test in Rising 2021-12-16 9
5058 WDAY Test 2021-12-20 9
5059 WDAY Test in Rising 2021-12-20 9
5060 WDAY Test in Rising 2021-12-22 9
5061 WDAY Test in Rising 2021-12-29 9
5062 WDAY Stopping Vol 2022-01-04 9
5063 WDAY Selling Climax 2022-01-04 9
5064 WDAY Shakeout 2022-01-04 9
5065 WDAY Absorption Vol 2022-01-05 9
1167 CROX Stopping Vol 2021-12-17 9
The problem I have now is that I only want as a result those tickers where these signals are met, but I want also to keep the order of the signals by date of appearance. I want to show Ticker only when I see Stopping Vol signal first, then a Absorption Vol signal and finally a Test in Rising signal. It doesn't matter if there are any other signals in between, I just want to keep that order in between them.
So the output would be something similar to:
Ticker SOS Date frequency
756 JPGY Selling Climax 2021-12-21 11
757 JPGY Stopping Vol 2021-12-27 11
758 JPGY Other signal 2021-12-28 11
761 JPGY Absorption Vol 2021-12-30 11
763 JPGY Test in Rising 2021-12-31 11
How can I get this?
I'm having a look to this other question and don't know how to progress. pandas groupby sort within groups
Upvotes: 0
Views: 35
Reputation: 862601
Use:
#in set is not defined order, so use list ot tuple
L = ['Stopping Vol', 'Absorption Vol', 'Test in Rising']
#filter rows if match all 3 values
df1 = df.groupby('Ticker').filter(lambda sf: set(L).issubset(sf['SOS']))
#get only rows with L
df1 = df1[df1['SOS'].isin(L)]
#get first and last values per filtered df1
df1 = df1.groupby(['Ticker'])['SOS'].agg(['first','last'])
#and test if first column is first value in L and last column last value in L
#middle values is already tested, so got only groups in this order
tickers = df1.index[df1['first'].eq(L[0]) & df1['last'].eq(L[2])]
df2 = df[df['Ticker'].isin(tickers)]
print (df2)
Upvotes: 1