anakaine

Reputation: 1248

Using Python + Pandas to calculate a value where 2 conditions from other columns are met

I have a pandas dataframe, and I'm trying to locate rows where two conditions are met, then calculate a value.

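The basic premise is: where PreClear_Fuel_Adj is 12 and ADJ_INT_WEIGHT_FUEL is 1200 or less, set ADJ_INT_WEIGHT_FUEL to PreClear_Fuel_Adj * 100.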

I need all other rows to stay the same and remain in the dataframe.

Currently I'm trying to use the following:

df['ADJ_INT_WEIGHT_FUEL'] = (df['PreClear_Fuel_Adj']*100).where(df['PreClear_Fuel_Adj'] == 12 & df['ADJ_INT_WEIGHT_FUEL'] <= 1200)

Example
Original

+---------------------+-------------------+
| ADJ_INT_WEIGHT_FUEL | PreClear_Fuel_Adj |
+---------------------+-------------------+
|                  10 |                 0 |
|                  30 |                12 | <-- Will be identified by .where()
|                  20 |                 0 |
|                   5 |                 0 |
|                  15 |                12 | <-- Will be identified by .where()
|                  25 |                 0 |
|                3500 |                12 |
+---------------------+-------------------+

Calculated

+---------------------+-------------------+
| ADJ_INT_WEIGHT_FUEL | PreClear_Fuel_Adj |
+---------------------+-------------------+
|                  10 |                 0 |
|                1200 |                12 | <-- Was identified by .where()
|                  20 |                 0 |
|                   5 |                 0 |
|                1200 |                12 | <-- Was identified by .where()
|                  25 |                 0 |
|                3500 |                12 |
+---------------------+-------------------+

Issues

I'm currently receiving an error. Pandas is being used within ArcGIS Pro, so there will be some ArcGIS Pro error messages in the dump below, too. Generally speaking, however, we can treat this example as a generic pandas dataframe.

It either seems like I have an issue with data types, or with using the & operator. That said, I'm not sure my selection and calculation syntax is actually correct. So I could use some feedback if there's an error there.

Thanks!

Traceback (most recent call last):
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\pandas\core\ops\array_ops.py", line 274, in na_logical_op
    result = op(x, y)
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\pandas\core\ops\roperator.py", line 52, in rand_
    return operator.and_(right, left)
**TypeError: ufunc 'bitwise_and' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''**

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\pandas\core\ops\array_ops.py", line 288, in na_logical_op
    result = libops.scalar_binop(x, y, op)
  File "pandas\_libs\ops.pyx", line 169, in pandas._libs.ops.scalar_binop
**ValueError: Buffer dtype mismatch, expected 'Python object' but got 'double'**

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 479, in execute
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\pandas\core\ops\common.py", line 64, in new_method
    return method(self, other)
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\pandas\core\ops\__init__.py", line 552, in wrapper
    res_values = logical_op(lvalues, rvalues, op)
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\pandas\core\ops\array_ops.py", line 366, in logical_op
    res_values = na_logical_op(lvalues, rvalues, op)
  File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\pandas\core\ops\array_ops.py", line 298, in na_logical_op
    f"Cannot perform '{op.__name__}' with a dtyped [{x.dtype}] array "
TypeError: Cannot perform 'rand_' with a dtyped [float64] array and scalar of type [bool]
 Failed to execute (Tool).

Upvotes: 1

Views: 89

Answers (2)

BENY

Reputation: 323226

Try

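# Parenthesize each comparison: & binds more tightly than == and <=
m = (df.PreClear_Fuel_Adj == 12) & (df.ADJ_INT_WEIGHT_FUEL <= 1200)
# Replace only the rows where m is True; all other rows keep their value
df['ADJ_INT_WEIGHT_FUEL'] = df['ADJ_INT_WEIGHT_FUEL'].mask(m, df['PreClear_Fuel_Adj'] * 100)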

Or

df.loc[m, 'ADJ_INT_WEIGHT_FUEL'] = df.loc[m, 'PreClear_Fuel_Adj'] * 100
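
For context on why the line in the question fails: in Python, `&` binds more tightly than `==` and `<=`, so the unparenthesized condition is read as `df[...] == (12 & df[...]) <= 1200`, and `12 & <float column>` is what raises the error in the traceback. Below is a minimal sketch of that, not a definitive diagnosis of the ArcGIS Pro environment. The DataFrame is rebuilt from the sample table in the question, and the float dtype is an assumption taken from the `float64` mentioned in the error message.

```python
import pandas as pd

# Sample data from the question; float dtype is assumed, matching the traceback.
df = pd.DataFrame({
    "ADJ_INT_WEIGHT_FUEL": [10.0, 30.0, 20.0, 5.0, 15.0, 25.0, 3500.0],
    "PreClear_Fuel_Adj": [0.0, 12.0, 0.0, 0.0, 12.0, 0.0, 12.0],
})

# Without parentheses, Python evaluates `12 & df[...]` before the comparisons,
# so the expression raises instead of producing a boolean mask.
try:
    bad = df["PreClear_Fuel_Adj"] == 12 & df["ADJ_INT_WEIGHT_FUEL"] <= 1200
    failed = False
except (TypeError, ValueError):
    failed = True

# Wrapping each comparison in parentheses gives the intended row-wise mask.
mask = (df["PreClear_Fuel_Adj"] == 12) & (df["ADJ_INT_WEIGHT_FUEL"] <= 1200)

print("unparenthesized condition failed:", failed)
print(mask.tolist())
```

The last row (3500) is excluded from the mask because it does not satisfy the `<= 1200` condition.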

Upvotes: 2

wwnde

Reputation: 26676

Use np.where(condition, answer if condition, answer if not condition)

import numpy as np
df['ADJ_INT_WEIGHT_FUEL'] = np.where(
    (df.PreClear_Fuel_Adj == 12) & (df.ADJ_INT_WEIGHT_FUEL <= 1200),
    df.PreClear_Fuel_Adj * 100,  # value where both conditions hold
    df.ADJ_INT_WEIGHT_FUEL,      # otherwise keep the existing value
)
print(df)

   ADJ_INT_WEIGHT_FUEL  PreClear_Fuel_Adj
0                   10                  0
1                 1200                 12
2                   20                  0
3                    5                  0
4                 1200                 12
5                   25                  0
6                 3500                 12
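
A related point about the `.where()` attempt in the question: even with the parentheses fixed, `Series.where(cond)` keeps values where `cond` is True and fills every other row with NaN by default, so the untouched rows would not stay the same. The sketch below assumes the sample data from the question (rebuilt here by hand) and passes the existing column as the `other` argument so unmatched rows keep their values.

```python
import pandas as pd

# Sample data rebuilt from the table in the question (an assumption for illustration).
df = pd.DataFrame({
    "ADJ_INT_WEIGHT_FUEL": [10, 30, 20, 5, 15, 25, 3500],
    "PreClear_Fuel_Adj": [0, 12, 0, 0, 12, 0, 12],
})

cond = (df["PreClear_Fuel_Adj"] == 12) & (df["ADJ_INT_WEIGHT_FUEL"] <= 1200)

# where(cond) on its own leaves NaN in every row that does not match.
only_matches = (df["PreClear_Fuel_Adj"] * 100).where(cond)

# Passing the original column as `other` keeps the unmatched rows unchanged.
df["ADJ_INT_WEIGHT_FUEL"] = (df["PreClear_Fuel_Adj"] * 100).where(
    cond, df["ADJ_INT_WEIGHT_FUEL"]
)

print(only_matches.isna().sum())
print(df)
```

This should give the same result as the `np.where` version above, while staying within pandas.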

Upvotes: 1
