germannp

Reputation: 197

Numpy.minimum with Pandas.Series of Timestamps TypeError: Cannot compare 'Timestamp' with 'int'

I would like to vectorize the computation of the overlap of time intervals using np.minimum on pd.Series:

np.minimum(
    pd.to_datetime('2018-01-16 21:43:00'),
    pd.Series([pd.to_datetime('2018-01-16 21:44:00'), pd.to_datetime('2018-01-16 21:41:00')]))

However, this results in the following TypeError:

TypeError                                 Traceback (most recent call last)
<ipython-input-84-07083aa6dce1> in <module>()
    1 np.minimum(
    2     pd.to_datetime('2018-01-16 21:43:00'),
----> 3     pd.Series([pd.to_datetime('2018-01-16 21:44:00'), pd.to_datetime('2018-01-16 21:41:00')]))

pandas\_libs\tslibs\timestamps.pyx in pandas._libs.tslibs.timestamps._Timestamp.__richcmp__()

TypeError: Cannot compare type 'Timestamp' with type 'int'

Using an np.array of Timestamps works like a charm (using .values does not):

np.minimum(
    pd.Series([1, pd.to_datetime('2018-01-16 21:43:00')])[1], 
    np.array([pd.to_datetime('2018-01-16 21:44:00'), pd.to_datetime('2018-01-16 21:41:00')]))

Any ideas?

Upvotes: 1

Views: 846

Answers (1)

Ben.T

Reputation: 29635

In short: create the upper bound with np.datetime64 instead of pd.to_datetime:

s = pd.Series([pd.to_datetime('2018-01-16 21:44:00'), pd.to_datetime('2018-01-16 21:41:00')])
print(np.minimum(s, np.datetime64('2018-01-16 21:43:00')))
0   2018-01-16 21:43:00
1   2018-01-16 21:41:00
dtype: datetime64[ns]

Alternatively, np.minimum(s, pd.to_datetime('2018-01-16 21:43:00').to_datetime64()) also works.
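
Applied to the interval-overlap computation from the question, the same idea might look like the sketch below. The interval data and the names starts, ends, window_start and window_end are hypothetical, chosen only for illustration, and the arrays are pulled out with .to_numpy() so that every comparison happens between datetime64 values:

```python
import numpy as np
import pandas as pd

# Hypothetical intervals to compare against one fixed window.
starts = pd.Series(pd.to_datetime(['2018-01-16 21:40:00',
                                   '2018-01-16 21:42:00',
                                   '2018-01-16 21:50:00']))
ends = pd.Series(pd.to_datetime(['2018-01-16 21:41:00',
                                 '2018-01-16 21:44:00',
                                 '2018-01-16 21:55:00']))

# Bounds built as np.datetime64, not pd.Timestamp.
window_start = np.datetime64('2018-01-16 21:40:30')
window_end = np.datetime64('2018-01-16 21:43:00')

# The overlap runs from the later start to the earlier end.
latest_start = np.maximum(starts.to_numpy(), window_start)
earliest_end = np.minimum(ends.to_numpy(), window_end)

# A negative difference means the intervals do not overlap; floor it at zero.
raw = earliest_end - latest_start
overlap = pd.Series(np.maximum(raw, np.timedelta64(0, 's')), index=starts.index)

print(overlap.dt.total_seconds().tolist())
```

With this data the first interval overlaps the window for 30 seconds, the second for 60 seconds, and the third not at all.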

In more detail: if you look at the dtype, or even at the element representation, of the two ways you create your data, you can see the difference.

print(s.values)
array(['2018-01-16T21:44:00.000000000', '2018-01-16T21:41:00.000000000'],
      dtype='datetime64[ns]')
print(np.array([pd.to_datetime('2018-01-16 21:44:00'), pd.to_datetime('2018-01-16 21:41:00')]))
array([Timestamp('2018-01-16 21:44:00'), Timestamp('2018-01-16 21:41:00')],
      dtype=object)

One interesting approach is to change the dtype of s.values, for example:

print(np.minimum(s.values.astype('datetime64[s]'),
                 pd.to_datetime('2018-01-16 21:43:00')))
array([Timestamp('2018-01-16 21:43:00'),
       datetime.datetime(2018, 1, 16, 21, 41)], dtype=object)

This works, but note that one element is a Timestamp and the other a datetime.datetime. It seems that the comparison is not possible when the dtype of s.values is datetime64[ns], while it is with datetime64[s] or even datetime64[ms].
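
As a rough check of the dtype difference described above, the following sketch builds both containers from the same Timestamps and then caps the datetime64 array with a bound converted via to_datetime64(). The variable names are illustrative only:

```python
import numpy as np
import pandas as pd

stamps = [pd.Timestamp('2018-01-16 21:44:00'),
          pd.Timestamp('2018-01-16 21:41:00')]

s = pd.Series(stamps)          # pandas stores these as datetime64
as_objects = np.array(stamps)  # NumPy keeps the Timestamp objects as-is

native = s.to_numpy()
print(native.dtype.kind)       # 'M' is NumPy's kind code for datetime64
print(as_objects.dtype)        # object

# Convert the scalar bound to datetime64 so both operands share a type.
bound = pd.Timestamp('2018-01-16 21:43:00').to_datetime64()
capped = np.minimum(native, bound)
print(capped)
```

Keeping both operands as datetime64 avoids the mixed Timestamp/datetime.datetime object array shown above.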

Also have a look at this answer, it may help.

Upvotes: 1
