Junaid Mohammad
Junaid Mohammad

Reputation: 477

how to select partial data from a pandas dataframe

I have a dataframe, flag=np.sign(value)

ticker    value    flag    cumsum(flag)
A1        -1.5     -1         -1
B1         1.4      1         0       
C3        -2.4     -1         -1 
D4        -1.8     -1         -2
E6        -1.6     -1         -3

I have a variable, cutoff = 1 (it is always +ve, its a modulus)

how can I best select the tickers where abs(cumsum(flag)) <= 1

i.e., expected output is [A1, B1, C3]

i.e I want to keep going down the cumsum list until I find the LAST 1 or -1

I tried a loop:

ticker_list_keep = []
for y in range(0, len(df['cumsum']), 1):
if abs(df['cumsum'][y]) < abs(capacity) + 1: 
ticker_list_keep.append(df.index[y])  

but this would give me only A1 and C3, and would miss out B2

Thanks

per note on comments: @Vaishali - The question is not a duplicate. I wanted ALL the values in the ticker list, up until we get to the final -1 in the cumsum list.

Above, we get to the final abs(val)=1 at C3, so my list is C3, B1,A1.

Your solution the thread you pointed me to gives only A1 and C3.

You notice A1 is not the final -1 in the cumsum list, therefore A1 alone doesn't suffice. We note C3 is where the final +/-1 occurs therefore our required list is A1,B1,C3

Thanks!!

Upvotes: 1

Views: 635

Answers (1)

Vaishali
Vaishali

Reputation: 38415

You can find last valid index based on your condition and create a slice.

idx = df[df['cumsum(flag)'].abs() <= 1].last_valid_index()
df.loc[:idx, :]

    ticker  value   flag    cumsum(flag)
0   A1      -1.5    -1      -1
1   B1       1.4     1       0
2   C3      -2.4    -1      -1

Upvotes: 2

Related Questions