Reputation: 79
I am trying to search for specific values in either of two columns and when a target value is found, change the number in a third column from positive to negative or negative to positive.
te1 = df.loc[df['Transaction Event'] == 'Exercise']
te2 = df.loc[df['Transaction Event'] == 'Assignment']
te3 = df.loc[df['Transaction Event'] == 'Expiration']
an1 = df.loc[df['Action'] == 'Delete']
nq = df['Net Quantity']
var1 = df[(df['Transaction Event'] == 'Exercise') | (df['Transaction Event'] == 'Assignment') | (df['Transaction Event'] == 'Expiration') | (df['Action'] == 'Delete')]
df.loc[df[var1], nq] = df.loc[df[var1], nq] * -1
Running this code returns the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-282-01dbb8066276> in <module>()
6 var1 = df[(df['Transaction Event'] == 'Exercise') | (df['Transaction Event'] == 'Assignment') | (df['Transaction Event'] == 'Expiration') | (df['Action'] == 'Delete')]
7
----> 8 df.loc[df[var1], nq] = df.loc[df[var1], nq] * -1
9 print(df)
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
1958 return self._getitem_array(key)
1959 elif isinstance(key, DataFrame):
-> 1960 return self._getitem_frame(key)
1961 elif is_mi_columns:
1962 return self._getitem_multilevel(key)
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py in _getitem_frame(self, key)
2034 if key.values.size and not is_bool_dtype(key.values):
2035 raise ValueError('Must pass DataFrame with boolean values only')
-> 2036 return self.where(key)
2037
2038 def query(self, expr, inplace=False, **kwargs):
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py in where(self, cond, other, inplace, axis, level, try_cast, raise_on_error)
5338 other = com._apply_if_callable(other, self)
5339 return self._where(cond, other, inplace, axis, level, try_cast,
-> 5340 raise_on_error)
5341
5342 @Appender(_shared_docs['where'] % dict(_shared_doc_kwargs, cond="False",
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py in _where(self, cond, other, inplace, axis, level, try_cast, raise_on_error)
5096 for dt in cond.dtypes:
5097 if not is_bool_dtype(dt):
-> 5098 raise ValueError(msg.format(dtype=dt))
5099
5100 cond = cond.astype(bool, copy=False)
ValueError: Boolean array expected for the condition, not float64
Does anyone know what is causing this error?
Upvotes: 5
Views: 7071
Reputation: 2795
You're not creating a mask; you're selecting a subset of your df when you do this:
var1 = df[(df['Transaction Event'] == 'Exercise') | (df['Transaction Event'] == 'Assignment') | (df['Transaction Event'] == 'Expiration') | (df['Action'] == 'Delete')]
Instead you need just this:
var1 = (df['Transaction Event'] == 'Exercise') | (df['Transaction Event'] == 'Assignment') | (df['Transaction Event'] == 'Expiration') | (df['Action'] == 'Delete')
In your current code you create the boolean array that you want, but then also index into your original df with that array, so var1 ends up as a DataFrame rather than a mask. You can confirm this by looking at what var1 actually contains in your current code.
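To make the difference concrete, here is a minimal sketch of the whole operation on a small made-up frame. Only the column names come from the question; the rows and the names `events` and `mask` are assumptions for illustration. It uses `isin` as a shorter equivalent of the chained `==` comparisons. Note that the last line of the question also needs adjusting: the mask goes straight into `.loc` as the row indexer (not wrapped in `df[...]`), and the column is given by its label rather than by the `nq` Series.

```python
import pandas as pd

# Hypothetical sample data; only the column names are taken from the question.
df = pd.DataFrame({
    'Transaction Event': ['Exercise', 'Trade', 'Expiration', 'Trade'],
    'Action': ['Add', 'Delete', 'Add', 'Add'],
    'Net Quantity': [10, -5, 3, 7],
})

events = ['Exercise', 'Assignment', 'Expiration']

# A boolean Series with one True/False per row: this is the mask.
# Wrapping this expression in df[...] would return the matching rows instead.
mask = df['Transaction Event'].isin(events) | (df['Action'] == 'Delete')
print(mask.dtype)  # bool

# Select rows with the mask and the column by its label, then flip the sign.
df.loc[mask, 'Net Quantity'] *= -1
print(df['Net Quantity'].tolist())  # [-10, 5, -3, 7]
```

Printing `mask` and `df[mask]` side by side shows the distinction: the first is a column of True/False values, the second is a DataFrame of the matching rows, which is what triggered the "Boolean array expected" error when it was passed back into `df[...]`.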
Upvotes: 5