Reputation: 889
Keen to know if it's possible to reproduce the following logic in Python, in one line, without creating a function.
z = IF (y-x) = 0 THEN a ELSE (y-x)
Using this df:
df = pd.DataFrame({'x': [10, 22, 31, 43, 57, 99, 65, 74, 88],
'y':[10, 50, 31, 66, 57, 199, 75, 80, 100]})
which looks like:
x y
0 10 10
1 22 50
2 31 31
3 43 66
4 57 57
5 99 199
6 65 75
7 74 80
8 88 100
To produce the following output:
x y z
10 10 10
22 50 28
31 31 31
43 66 23
57 57 57
99 199 100
65 75 10
74 80 6
88 100 12
I have attempted the following but this returns a syntax error.
z = if(y - x) == 0: a else: (y - x)
I appreciate there are similar questions but I haven't found anything that is applicable to my use case or has a sufficient explanation such that I can repurpose the code.
Upvotes: 1
Views: 239
Reputation:
You can use np.where
:
import numpy as np
df['z'] = np.where(df['x']==df['y'], df['x'], df['y'] - df['x'])
Here, the condition df['x']==df['y']
creates a boolean Series of length len(df)
ordered in the same order as df
, i.e. with the same index. Then wherever True, you take the value from df['x']
in the corresponding index and wherever False, you take the value from df['y'] - df['x']
in the corresponding index. As it's clear from the previous sentence, all three Series must have the same length (or must be broadcastable).
The output we get (that we assign to df['z']
) is and array of length len(df)
with elements from df['x']
where our condition is True, and elements from df['y'] - df['x']
otherwise.
From the docs:
If all the arrays are 1-D, where is equivalent to:
[xv if c else yv for c, xv, yv in zip(condition, x, y)]
and since we have 1-D arrays here, we could also use
df['z'] = [x if c else y_minus_x
for c, x, y_minus_x in zip(df['x']==df['y'], df['x'], df['y'] - df['x'])]
for the same results.
Output:
x y z
0 10 10 10
1 22 50 28
2 31 31 31
3 43 66 23
4 57 57 57
5 99 199 100
6 65 75 10
7 74 80 6
8 88 100 12
Upvotes: 4