Doctor_W

Reputation: 45

pandas: condition with different index

Since if statements and explicit loops are slow in pandas, I'm trying to work out how to express the following condition without them.

import pandas

df = pandas.DataFrame()
df['x'] = [1.2, 1.5, 1.7, 1.9]
df['y'] = [1.7, 1.8, 0.7, 7.0]
print(df)

     x    y
0  1.2  1.7
1  1.5  1.8
2  1.7  0.7
3  1.9  7.0

What I want is to check, for each row, whether df.y will ever be less than df.x at that row or a later one, and if so, to create a new column holding the difference between the two indices.

For example,

check df.y[0] < df.x[0]; if that is not true,

then check df.y[0] < df.x[1]; if that is true, then df.new[0] = 1 - 0.

Then move to the next value and check df.y[1] < df.x[1];

if that is not true, check df.y[1] < df.x[2]; if that is true, then df.new[1] = 2 - 1.

If df.y[i] is never less than any df.x[n] value, then set df.new[i] to False.
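
For reference, the rule above can be pinned down with a plain (slow) nested loop. This is only a sketch of the intended result, not a proposed solution: the helper name first_lower_offset is made up for illustration, and it assumes a default 0..n-1 index, a strict y < x comparison, and that only rows at or after row i are checked.

```python
import pandas as pd

df = pd.DataFrame({"x": [1.2, 1.5, 1.7, 1.9],
                   "y": [1.7, 1.8, 0.7, 7.0]})


def first_lower_offset(frame):
    """Hypothetical helper: for each row i, the number of rows until y[i] < x[j], j >= i."""
    x = frame["x"].to_numpy()
    y = frame["y"].to_numpy()
    result = []
    for i in range(len(frame)):
        offset = False  # stays False if y[i] is never below x[j] for any j >= i
        for j in range(i, len(frame)):
            if y[i] < x[j]:
                offset = j - i  # difference between the two index positions
                break
        result.append(offset)
    return result


expected = first_lower_offset(df)
print(expected)
```

The goal is to get the same values without the two nested Python loops.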

The output in this case should look like this:

     x    y    new
0  1.2  1.7      3
1  1.5  1.8      2
2  1.7  0.7      0
3  1.9  7.0  False

Here df.new is the difference between the index of the first df.x value that exceeds df.y and the index of the df.y value being tested.

df.new takes its meaning from whatever the index is: if the index is time, df.new is the time until df.y is first lower than df.x.

Upvotes: 1

Views: 83

Answers (1)

EdChum

Reputation: 394439

The following works, but it is really still a loop because it uses apply:

In [225]:
def func(x):
    # y is greater than every x, so it never drops below x
    if (x['y'] > df['x']).all():
        return False
    else:
        # index label of the first x that y does not exceed
        return (x['y'] > df['x']).idxmin()
df['new'] = df.apply(func, axis=1)
df

Out[225]:
     x    y    new
0  1.2  1.7      2
1  1.5  1.8      3
2  1.7  0.7      0
3  1.9  7.0  False
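
If a version without any Python-level loop is needed, one option is NumPy broadcasting. The following is a minimal sketch under some assumptions, not a definitive implementation: it builds a full n-by-n comparison matrix (so memory grows quadratically with the number of rows), it uses a strict y < x comparison, and it only looks at rows at or after the current one, which is one reading of the rule in the question. The variable names below, found and first, are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [1.2, 1.5, 1.7, 1.9],
                   "y": [1.7, 1.8, 0.7, 7.0]})

x = df["x"].to_numpy()
y = df["y"].to_numpy()
n = len(df)

# below[i, j] is True where y[i] < x[j]
below = y[:, None] < x[None, :]
# keep only columns j >= i, i.e. look forward from the current row
below &= np.triu(np.ones((n, n), dtype=bool))

found = below.any(axis=1)     # does y[i] ever drop below a later x?
first = below.argmax(axis=1)  # position of the first True in each row (0 if none)

# offset j - i as Python objects so False can be stored alongside integers
values = (first - np.arange(n)).astype(object)
values[~found] = False
df["new"] = values
print(df)
```

Under those assumptions this returns the offset between the two positions rather than the index label, so rows 0 and 1 come out as 3 and 2. For large frames the n-by-n matrix may not fit in memory, in which case processing the rows in chunks is a reasonable compromise.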

Upvotes: 1
