Viktor.w
Viktor.w

Reputation: 2297

Pandas: select the highest and lowest values between two specifc values from another column

My original dataframe looks like this:

 macd_histogram  direct    event
1.675475e-07    up  crossing up
2.299171e-07    up  0
2.246809e-07    up  0
1.760860e-07    up  0
1.899371e-07    up  0
1.543226e-07    up  0
1.394901e-07    up  0
-3.461691e-08  down crossing down
1.212740e-06    up  0
6.448285e-07    up  0
2.227792e-07    up  0
-8.738289e-08  down crossing up
-3.109205e-07  down 0

The column event is filled with crossing up and crossing down! What I would need is between a crossing up and a crossing down substract the highest value from the column macd_histogram (between the same index) and substract it from the lowest and add it to a new column next to crossing up!

I tried to do it with a for loop, but I am a bit lost on how to select the range between each crossing up and crossing down... any help? Thanks!

What I expect in fact (following the above dataframe):

 macd_histogram  direct    event magnitude
1.675475e-07    up  crossing up (0.851908-07)
2.299171e-07    up  0
2.246809e-07    up  0
1.760860e-07    up  0
1.899371e-07    up  0
1.543226e-07    up  0
1.394901e-07    up  0
-3.461691e-08  down crossing down (2.651908-06)
1.212740e-06    up  0
6.448285e-07    up  0
2.227792e-07    up  0
-8.738289e-08  down crossing up etc..
-3.109205e-07  down 0

This is what I tried so far:

index_up = df[df.event == 'crossing up'].index.values
index_down = df[df.event == 'crossing down'].index.values


df['magnitude'] = 0
array = np.array([])
for i in index_up:
    for idx in index_down:
        values = df.loc[i:idx, 'macd_histogram'].tolist()
        max = np.max(values)
        min = np.min(values)
        magnitutde = max-min
        print(magnitude)
       df.at[i,'magnitude'] = magnitude

But I have the following error message: ValueError: zero-size array to reduction operation maximum which has no identity

Upvotes: 0

Views: 76

Answers (1)

run-out
run-out

Reputation: 3184

I think I understand what you are asking for, but my result numbers don't match your example, so maybe I don't understand fully. Hopefully this much answer will help you.

First create a column to place the result.

df['result'] = np.nan

Create a variable with just the index of rows with crossing up/down.

event_range = df[df['event'] != '0'].index

Make a for loop to loop through the index array. Create a start and end index number for each section, get a maximum and minimum of the range for each start/end index number, and subtract and place to the right column.

for x in range(len(event_range)-1):    
    start = event_range[x]
    end = event_range[x+1] +1 # I'm not sure if this is the range you want

    max = df.iloc[start:end, 0].max()
    min = df.iloc[start:end, 0].min()

    diff = max - min
    df.iloc[start, 3] = diff

df


    macd_histogram  direct  event             result
0   1.675480e-07    up      crossing up       2.645339e-07
1   2.299170e-07    up      0                 NaN
2   2.246810e-07    up      0                 NaN
3   1.760860e-07    up      0                 NaN
4   1.899370e-07    up      0                 NaN
5   1.543230e-07    up      0                 NaN
6   1.394900e-07    up      0                 NaN
7  -3.461690e-08    down    crossing down     1.300123e-06
8   1.212740e-06    up      0                 NaN
9   6.448290e-07    up      0                 NaN
10  2.227790e-07    up      0                 NaN
11 -8.738290e-08    down    crossing up       NaN
12 -3.109210e-07    down    0                 NaN

Upvotes: 2

Related Questions