Reputation: 45
Since if statements and explicit loops are very slow in pandas, I'm trying to work out how to rewrite the condition I want without them.
import pandas
df = pandas.DataFrame()
df['x'] = [1.2, 1.5, 1.7, 1.9]
df['y'] = [1.7, 1.8, 0.7, 7.0]
print(df)
     x    y
0  1.2  1.7
1  1.5  1.8
2  1.7  0.7
3  1.9  7.0
What I want is a condition that checks whether df.y will ever be less than df.x and, if so, creates a new column holding the difference between the two indices.
For example, check df.y[0] < df.x[0]; if not, then check df.y[0] < df.x[1]. If that is true, then df.new[0] = 1 - 0.
Then move to the next value: check df.y[1] < df.x[1]; if not, then check df.y[1] < df.x[2]. If that is true, then df.new[1] = 2 - 1.
If df.y[i] is always greater than every df.x[n] value, then set df.new[i] to False.
The output in this case should look like this:
     x    y    new
0  1.2  1.7      3
1  1.5  1.8      2
2  1.7  0.7      0
3  1.9  7.0  False
Here df.new is the difference between the index of the df.x value that df.y first falls below and the index of the df.y value being tested.
What df.new means depends on the index: if the index is time, df.new indicates the first time at which df.y will be lower than df.x.
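For reference, the rule described above can be written as a plain nested loop, which is the slow form the question wants to avoid. This is only a sketch under some assumptions: the search for row i starts at position i, the comparison is strictly less-than, and the data uses the values from the printed frame (y[3] = 7.0). The helper name first_offset_below is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"x": [1.2, 1.5, 1.7, 1.9],
                   "y": [1.7, 1.8, 0.7, 7.0]})

def first_offset_below(x, y):
    """For each y[i], return j - i for the first j >= i with y[i] < x[j], else False."""
    result = []
    for i, y_i in enumerate(y):
        offset = False  # default when y[i] never drops below any x[j], j >= i
        for j in range(i, len(x)):
            if y_i < x[j]:
                offset = j - i
                break
        result.append(offset)
    return result

# Mixing integers and False gives the column object dtype.
df["new"] = first_offset_below(df["x"].tolist(), df["y"].tolist())
print(df)
```

On the example data this produces the expected column: 3, 2, 0, False.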
Upvotes: 1
Views: 83
Reputation: 394439
The following works, but it is really still a loop, because it uses apply:
In [225]:
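def func(x):
    # y is greater than every value in df['x']: no match at all
    if (x['y'] > df['x']).all():
        return False
    else:
        # index label of the first df['x'] that y does not exceed
        return (x['y'] > df['x']).idxmin()

df['new'] = df.apply(lambda row: func(row), axis=1)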
df
Out[225]:
     x    y    new
0  1.2  1.7      2
1  1.5  1.8      3
2  1.7  0.7      0
3  1.9  7.0  False
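Note that the output above holds the index label of the first x that y does not exceed, searched over the whole column, so it differs from the offsets requested in the question. A loop-free alternative is to compare every y against every x with NumPy broadcasting. The sketch below is an assumption-laden illustration rather than a definitive solution: it assumes a strict less-than comparison, a search that starts at the current row, and a frame small enough that an n-by-n boolean matrix fits in memory.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [1.2, 1.5, 1.7, 1.9],
                   "y": [1.7, 1.8, 0.7, 7.0]})

x = df["x"].to_numpy()
y = df["y"].to_numpy()

# below[i, j] is True when y[i] < x[j]
below = y[:, None] < x[None, :]
# Keep only positions j >= i (upper triangle, diagonal included)
below = np.triu(below)

has_hit = below.any(axis=1)        # does row i ever fall below some x[j]?
first_hit = below.argmax(axis=1)   # position of the first True (0 if none)
offset = first_hit - np.arange(len(df))

# Use object dtype so integers and False can share one column
new = offset.astype(object)
new[~has_hit] = False
df["new"] = new
print(df)
```

This avoids apply entirely, at the cost of O(n²) memory for the comparison matrix. On the example data it yields 3, 2, 0, False.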
Upvotes: 1