HMLDude
HMLDude

Reputation: 1637

Conditionally altering a dataframe column based on another column's values

I have the following pandas dataframe d1:

+----------+-------+---------+--------------+
| Item Num | Cost  | Revenue |  Rev / Cost  |
+----------+-------+---------+--------------+
|        1 | 45.76 |  345.67 | 7.5539772727 |
|        2 | 55.78 |  456.92 | 8.1914664754 |
|        3 | 34.68 |       0 |            0 |
|        4 | 79.85 |       0 |            0 |
+----------+-------+---------+--------------+

What I would like is for the values of the Cost / Rev column to be equal to the Cost for that row, multiplied by negative 1, in cases where the 'Cost / Rev' is equal to 0.

So the desired output would be:

+----------+-------+---------+--------------+
| Item Num | Cost  | Revenue |  Rev / Cost  |
+----------+-------+---------+--------------+
|        1 | 45.76 |  345.67 | 7.5539772727 |
|        2 | 55.78 |  456.92 | 8.1914664754 |
|        3 | 34.68 |       0 |       -34.68 |
|        4 | 79.85 |       0 |       -79.85 |
+----------+-------+---------+--------------+

What I have so far is:

d1['Rev / Cost'] = d1['Rev / Cost'].apply(lambda x: x if x > 0 else d1['Cost'])

Which simply overwrites the intended range with a single value and throws the following warning:

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

Upvotes: 0

Views: 132

Answers (2)

Ted Petrou
Ted Petrou

Reputation: 61947

Since booleans evaluate to 0/1, you can simply multiply the condition by the Cost and subtract it from Rev / Cost. This gives a nice performance boost.

df['Rev / Cost'] -=  df['Cost'] * (df['Rev / Cost'] == 0)

You can also use np.where

df['Rev / Cost'] = np.where(df['Rev / Cost'] == 0, -df['Cost'], df['Rev / Cost']

Or Series.where

df['Rev / Cost'] = df['Rev / Cost'].where(lambda x: x != 0, df.Cost)

Upvotes: 1

cs95
cs95

Reputation: 402353

Create a mask and then use loc to assign to a subslice.

mask = df['Rev / Cost'] == 0
df.loc[mask, 'Rev / Cost'] = df.loc[mask, 'Cost'].mul(-1)

Upvotes: 2

Related Questions