jimiclapton
jimiclapton

Reputation: 889

Conditional sum of dataframe columns in Python

Keen to know if it's possible to reproduce the following logic in Python, in one line, without creating a function.

z = IF (y-x) = 0 THEN a ELSE (y-x)

Using this df:

df = pd.DataFrame({'x': [10, 22, 31, 43, 57, 99, 65, 74, 88],
              'y':[10, 50, 31, 66, 57, 199, 75, 80, 100]})

which looks like:

    x    y
0  10   10
1  22   50
2  31   31
3  43   66
4  57   57
5  99  199
6  65   75
7  74   80
8  88  100

To produce the following output:

 x  y   z
10  10  10
22  50  28
31  31  31
43  66  23
57  57  57
99  199 100
65  75  10
74  80  6
88  100 12

I have attempted the following but this returns a syntax error.

z = if(y - x) == 0: a else: (y - x)

I appreciate there are similar questions but I haven't found anything that is applicable to my use case or has a sufficient explanation such that I can repurpose the code.

Upvotes: 1

Views: 239

Answers (1)

user7864386
user7864386

Reputation:

You can use np.where:

import numpy as np
df['z'] = np.where(df['x']==df['y'], df['x'], df['y'] - df['x'])

Here, the condition df['x']==df['y'] creates a boolean Series of length len(df) ordered in the same order as df, i.e. with the same index. Then wherever True, you take the value from df['x'] in the corresponding index and wherever False, you take the value from df['y'] - df['x'] in the corresponding index. As it's clear from the previous sentence, all three Series must have the same length (or must be broadcastable).

The output we get (that we assign to df['z']) is and array of length len(df) with elements from df['x'] where our condition is True, and elements from df['y'] - df['x'] otherwise.

From the docs:

If all the arrays are 1-D, where is equivalent to:

[xv if c else yv
 for c, xv, yv in zip(condition, x, y)]

and since we have 1-D arrays here, we could also use

df['z'] = [x if c else y_minus_x 
           for c, x, y_minus_x in zip(df['x']==df['y'], df['x'], df['y'] - df['x'])]

for the same results.

Output:

    x    y    z
0  10   10   10
1  22   50   28
2  31   31   31
3  43   66   23
4  57   57   57
5  99  199  100
6  65   75   10
7  74   80    6
8  88  100   12

Upvotes: 4

Related Questions