Reputation: 169
I need help creating a function to change the value of certain rows in certain columns.
Consider the following dataframe:
import numpy as np
import pandas as pd

serie = [0, 1, 2, 0, 1, 2, 0, 1, 2]
dataX = [0.1, 0.24, 0.21, 0.1, 0.25, 0.2, 0.2, 0.38, 0.49]
dataY = [0.1, 0.23, 0.21, 0.1, 0.27, 0.2, 0.2, 0.38, 0.49]
dataZ = [0.1, 0.26, 0.21, 0.1, 0.25, 0.2, 0.2, 0.49, 0.59]
dataW = [0.1, 0.23, 0.21, 0.1, 0.28, 0.2, 0.2, 0.49, 0.59]
my_dict = {'serie': serie,
           'dataX': dataX,
           'dataY': dataY,
           'dataZ': dataZ,
           'dataW': dataW}
df_serialized = pd.DataFrame.from_dict(my_dict)
I need to set the dataY and dataZ columns to zero in every row where the serie column is zero.
What I have tried so far:
df_serialized[df_serialized.serie == 0][['dataY', 'dataZ']].apply(np.zeros)
This raises the following error:
TypeError                                 Traceback (most recent call last)
in ()
----> 1 df_serialized[df_serialized.serie == 0][['dataY', 'dataZ']].apply(np.zeros)

3 frames
/usr/local/lib/python3.6/dist-packages/pandas/core/frame.py in apply(self, func, axis, broadcast, raw, reduce, result_type, args, **kwds)
   6485                          args=args,
   6486                          kwds=kwds)
-> 6487         return op.get_result()
   6488
   6489     def applymap(self, func):

/usr/local/lib/python3.6/dist-packages/pandas/core/apply.py in get_result(self)
    149             return self.apply_raw()
    150
--> 151         return self.apply_standard()
    152
    153     def apply_empty_result(self):

/usr/local/lib/python3.6/dist-packages/pandas/core/apply.py in apply_standard(self)
    255
    256         # compute the result using the series generator
--> 257         self.apply_series_generator()
    258
    259         # wrap results

/usr/local/lib/python3.6/dist-packages/pandas/core/apply.py in apply_series_generator(self)
    284             try:
    285                 for i, v in enumerate(series_gen):
--> 286                     results[i] = self.f(v)
    287                     keys.append(v.name)
    288             except Exception as e:

TypeError: ("'numpy.float64' object cannot be interpreted as an integer", 'occurred at index dataY')
Upvotes: 0
Views: 59
Reputation: 407
Use .loc with a boolean mask to select the rows, then assign to each column:
df_serialized.loc[df_serialized.serie == 0, 'dataY'] = 0
df_serialized.loc[df_serialized.serie == 0, 'dataZ'] = 0
Or set both columns in a single assignment:
df_serialized.loc[df_serialized.serie == 0, ['dataZ', 'dataY']] = 0
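As for why the original attempt fails: np.zeros expects a shape as its first argument, so apply ends up passing each float column to it as if it were a shape, hence the "cannot be interpreted as an integer" error. Even without that error, df_serialized[mask][cols] is chained indexing that works on a copy, so the original dataframe would not change. Since the question asks for a function, here is a minimal sketch of how the .loc assignment could be wrapped in one. The helper name zero_where and its parameters are hypothetical, not part of pandas, and the frame is trimmed to three columns for brevity.

```python
import pandas as pd

# Same values as in the question, reduced to the columns that matter here.
serie = [0, 1, 2, 0, 1, 2, 0, 1, 2]
dataY = [0.1, 0.23, 0.21, 0.1, 0.27, 0.2, 0.2, 0.38, 0.49]
dataZ = [0.1, 0.26, 0.21, 0.1, 0.25, 0.2, 0.2, 0.49, 0.59]
df = pd.DataFrame({'serie': serie, 'dataY': dataY, 'dataZ': dataZ})


def zero_where(frame, mask, columns):
    """Hypothetical helper: return a copy of `frame` with `columns`
    set to 0 on the rows where the boolean `mask` is True."""
    out = frame.copy()          # leave the caller's dataframe untouched
    out.loc[mask, columns] = 0  # single label-based assignment, no chained indexing
    return out


result = zero_where(df, df['serie'] == 0, ['dataY', 'dataZ'])
print(result)
```

Returning a copy is only one possible design. If you would rather modify df_serialized in place, the plain .loc assignment above does exactly that.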
Upvotes: 2