Reputation: 1070
I have the following dict and pandas DataFrame.
import numpy as np
import pandas as pd
from pandas import Timestamp

sample_dict = {'isDuplicate': {'1051681551': False, '1037545402': True, '1035390559': False},
               'dateTime': {'1051681551': Timestamp('2019-01-29 09:09:00+0000', tz='UTC'),
                            '1037545402': Timestamp('2019-01-11 02:06:00+0000', tz='UTC'),
                            '1035390559': Timestamp('2019-01-08 14:35:00+0000', tz='UTC')},
               'dateTimePub': {'1051681551': None, '1037545402': None, '1035390559': None}}
df = pd.DataFrame.from_dict(sample_dict)
I want to apply an np.where() function to the dateTime and dateTimePub columns like:
def _replace_datetime_with_datetime_pub(news_dataframe):
    news_dataframe.dateTime = np.where(news_dataframe.dateTimePub, news_dataframe.dateTimePub, news_dataframe.dateTime)
    return pd.to_datetime(news_dataframe.dateTime)

df.apply(_replace_datetime_with_datetime_pub)
But I get the following error:
AttributeError: 'Series' object has no attribute 'dateTimePub'
It's possible to do df = _replace_datetime_with_datetime_pub(df). But my question is: how do I apply this function via either the pd.DataFrame.apply or the pd.DataFrame.transform method, and why do I get this error?
I have already checked many other similar questions, but none of them involved an AttributeError.
Upvotes: 0
Views: 158
Reputation: 36
With apply, you're breaking down your DataFrame into Series to pass to your function. Since you don't specify the axis keyword argument, pandas uses the default, axis=0, and passes each column as a Series. This is the source of the AttributeError you're getting: a column Series is indexed by the row labels, so it has no dateTimePub attribute. For pandas to pass each row as a Series, specify axis=1 in your apply call.
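To see the difference for yourself, here is a minimal sketch that records what apply hands to the function under each axis setting. The small frame, the `record` helper, and the `by_column` / `by_row` dicts are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Toy frame (illustrative values) reusing the question's column names.
df = pd.DataFrame(
    {"dateTime": [1, 2], "dateTimePub": [3, 4]},
    index=["x", "y"],
)

def record(store):
    """Build a function that notes each Series' name and index labels."""
    def inner(series):
        store[series.name] = list(series.index)
        return 0
    return inner

by_column = {}
df.apply(record(by_column))        # default axis=0: one Series per column

by_row = {}
df.apply(record(by_row), axis=1)   # axis=1: one Series per row

print(by_column)  # keys are column names, labels are the row index
print(by_row)     # keys are row labels, labels are the column names
```

Only in the axis=1 case do the Series carry dateTime and dateTimePub as index labels, which is what attribute access such as row.dateTimePub relies on.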
Even then, you'll need to adapt the function somewhat to fit the apply paradigm. In particular, think about how the function should process each row it encounters. The function you pass to apply (with axis=1) works on each row in isolation from every other row. The return values from the rows are then stitched together into a Series.
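One possible row-wise adaptation is sketched below. The `pick_datetime` helper name is made up here, the frame is a shortened version of the question's data, and one dateTimePub value is filled in (an assumption, since the original sample has only None) so that both branches are exercised.

```python
import pandas as pd

# Shortened sample; the non-null dateTimePub value is invented for illustration.
df = pd.DataFrame(
    {
        "isDuplicate": [False, True],
        "dateTime": pd.to_datetime(
            ["2019-01-29 09:09", "2019-01-11 02:06"], utc=True
        ),
        "dateTimePub": [None, pd.Timestamp("2019-01-10 00:00", tz="UTC")],
    },
    index=["1051681551", "1037545402"],
)

def pick_datetime(row):
    """Return dateTimePub when it is present, otherwise fall back to dateTime."""
    if pd.notna(row["dateTimePub"]):
        return row["dateTimePub"]
    return row["dateTime"]

# axis=1 passes each row as a Series; the scalar results form a new Series.
result = df.apply(pick_datetime, axis=1)
df["dateTime"] = pd.to_datetime(result, utc=True)
```

The function now returns a single value per row instead of assigning to a column, and the assignment happens once, outside apply. Using pd.notna covers both None and NaT, whichever pandas stores for the missing entries.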
Upvotes: 1