Reputation: 816
Suppose I have two datetime series:
import numpy as np
import pandas as pd

foo = pd.to_datetime(pd.Series([
    '2020-01-01 12:00:00',
    '2020-02-02 23:12:00'
]))
bar = pd.to_datetime(pd.Series([
    '2020-01-20 01:02:03',
    '2020-01-30 03:02:01'
]))
Both are of type datetime64[ns]:
>>> foo
0 2020-01-01 12:00:00
1 2020-02-02 23:12:00
dtype: datetime64[ns]
>>> bar
0 2020-01-20 01:02:03
1 2020-01-30 03:02:01
dtype: datetime64[ns]
For each element in foo, I want to get the minimum of:
the element of foo itself
bar.max(), the largest value in bar
But this raises a TypeError:
>>> np.minimum(foo, bar.max())
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
...
TypeError: '<=' not supported between instances of 'int' and 'Timestamp'
It works if I pass the two Series themselves:
>>> np.minimum(foo, bar)
0 2020-01-01 12:00:00
1 2020-01-30 03:02:01
dtype: datetime64[ns]
bar.max() returns a Timestamp for some reason, instead of a datetime64, but even using an explicit Python datetime object doesn't work. Why is NumPy treating the values of foo as int? Is there a way around this?
Upvotes: 1
Views: 1610
Reputation: 839
How about mapping np.minimum over foo:
>>> barmax = bar.max()
>>> barmax
Timestamp('2020-01-30 03:02:01')
>>> foo.map(lambda x: np.minimum(x, barmax))
0 2020-01-01 12:00:00
1 2020-01-30 03:02:01
dtype: datetime64[ns]
>>>
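If the element-wise lambda turns out to be slow on a long Series, one possible vectorized variant is to convert both operands to NumPy datetime64 before calling np.minimum. This is only a sketch: the name capped is made up for illustration, and it assumes that plain datetime64 values (no time zone) are acceptable.

```python
import numpy as np
import pandas as pd

foo = pd.to_datetime(pd.Series(["2020-01-01 12:00:00", "2020-02-02 23:12:00"]))
bar = pd.to_datetime(pd.Series(["2020-01-20 01:02:03", "2020-01-30 03:02:01"]))

# Convert the pandas Timestamp to a NumPy datetime64 scalar so that
# np.minimum sees datetime64 on both sides.
barmax = bar.max().to_datetime64()

# Element-wise minimum on the raw datetime64 array, then rebuild a Series
# that keeps foo's original index ("capped" is a hypothetical name).
capped = pd.Series(np.minimum(foo.to_numpy(), barmax), index=foo.index)
print(capped)
```

The result should match the map version above, without a Python-level call per element.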
Upvotes: 0
Reputation: 2776
Using pandas.Series.where:
foo.where(foo < bar.max(), bar.max())
This replaces values of foo with bar.max() wherever the condition (foo < bar.max()) is False.
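A self-contained sketch of the where approach, using the data from the question. It also shows Series.clip(upper=...), which is not part of the answer above but should give the same cap in one call. The names upper, via_where, and via_clip are made up for illustration.

```python
import pandas as pd

foo = pd.to_datetime(pd.Series(["2020-01-01 12:00:00", "2020-02-02 23:12:00"]))
bar = pd.to_datetime(pd.Series(["2020-01-20 01:02:03", "2020-01-30 03:02:01"]))

# Compute the cap once instead of calling bar.max() twice.
upper = bar.max()

# Keep foo where it is already below the cap; otherwise substitute the cap.
via_where = foo.where(foo < upper, upper)

# Assumed alternative (not from the answer): clip caps values at an upper bound.
via_clip = foo.clip(upper=upper)

print(via_where)
print(via_clip)
```

Both stay inside pandas, so there is no Timestamp versus datetime64 conversion to handle.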
Upvotes: 3