Reputation: 14699
I need to compute a column where the value is the result of a vectorized operation over other columns:
df["new_col"] = df["col1"] - min(0,df["col2"])
It turns out, however, that I cannot use min this way. What is the right way to get the minimum of zero and each value of a pandas column?
Upvotes: 7
Views: 4771
Reputation: 81664
I think the other answers aren't what you meant. They take the minimum value in df['col2'] and compare it to 0 (and thus always return the same value), while you wanted the minimum between each value in col2 and 0:
import pandas as pd
df = pd.DataFrame(data={'a': [2, 3], 'b': [-1, 1]})
# list() is needed on Python 3, where map returns a lazy iterator
df['new_col'] = list(map(lambda a, b: a - min(0, b), df['a'], df['b']))
print(df)
   a  b  new_col
0  2 -1        3
1  3  1        3
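To make the distinction concrete, here is a small sketch (the names scalar_min and rowwise_min are only illustrative) contrasting the column-wide minimum with the per-row minimum on the same toy frame:

```python
import pandas as pd

df = pd.DataFrame({"a": [2, 3], "b": [-1, 1]})

# Column-wide: one scalar, the smaller of 0 and the smallest value in b
scalar_min = min(0, df["b"].min())

# Row-wise: one value per row, the smaller of 0 and that row's b
rowwise_min = [min(0, b) for b in df["b"]]

col_wide = df["a"] - scalar_min   # subtracts the same number from every row
row_wise = df["a"] - rowwise_min  # subtracts a different number per row
```

With this data the two results differ in the second row, where b is positive and only the row-wise version subtracts 0.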
Upvotes: 0
Reputation: 32224
You could use some masking and a temporary column, ignoring the min function entirely.
magicnumber = 0
tempcol = df['col2'].copy()  # copy so that df['col2'] itself is not modified
mask = tempcol < magicnumber
tempcol[~mask] = magicnumber  # cap every value at or above magicnumber
df['col1'] - tempcol
Or you can use a lambda function:
magicnumber = 0
df['col1'] - df['col2'].apply(lambda x: min(magicnumber, x))
Or you can apply over two columns:
df['magicnumber'] = 0
df['col1'] - df[['col2', 'magicnumber']].apply(np.min, axis=1)
Upvotes: 1
Reputation: 69183
You can use numpy.minimum to find the element-wise minimum of two arrays (here, of the scalar 0 and the column):
import numpy as np
df["new_col"] = df["col1"] - np.minimum(0,df["col2"])
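As a rough check, the sketch below runs this on made-up data and compares it with pandas' own Series.clip(upper=0), which should give the same result here. It also shows why the built-in min fails: it needs a single True/False from comparing a whole Series, which pandas refuses to provide. The sample values and variable names are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": [2, 3, 5], "col2": [-1, 1, 0]})

# Built-in min compares the Series to 0 and then asks for one truth value,
# which pandas rejects with a ValueError for a multi-element Series.
builtin_error = None
try:
    min(0, df["col2"])
except ValueError as exc:
    builtin_error = exc

# Element-wise minimum of 0 and each value in col2
via_numpy = df["col1"] - np.minimum(0, df["col2"])

# Same idea using pandas only: cap col2 at an upper bound of 0
via_clip = df["col1"] - df["col2"].clip(upper=0)
```

Either form stays vectorized, so it avoids the per-row Python calls that apply or map would make.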
Upvotes: 9