OnceUponATime
OnceUponATime

Reputation: 488

Matching Timestamps and DatetimeIndex in Pandas

I am working on a script for historians that allows them to compare data from different EXCEL tables with data input by the users running the scripts.

The users generally enter dates like this:

1765-04-05

The input by the users is then converted to DateTime in Pandas as follows:

year=input()
y = pd.to_datetime(year)

This creates a <class 'pandas._libs.tslibs.timestamps.Timestamp'> and can be printed as:

Searching for events with exact date 1760-01-01 00:00:00 !

As no seconds, minutes or hours were indicated, they are obviously all 0. Now we have got EXCEL tables with dates in YYYY-MM-DD format which also do not have hours, minutes or seconds. I am trying to convert them in analogy to the user input:

time=pd.to_datetime(f['event_start'].values)

When I check what the script is reading from the indicated column and what type it is, I get this:

DatetimeIndex(['1739-01-01', '1740-01-01', '1741-01-01', '1742-01-01',
               '1744-01-01', '1752-01-01', '1753-01-01', '1755-01-01',
               '1756-01-01', '1757-01-01',
               ...
                      'NaT',        'NaT',        'NaT',        'NaT',
                      'NaT',        'NaT',        'NaT',        'NaT',
                      'NaT',        'NaT'],
              dtype='datetime64[ns]', length=2228, freq=None)
<class 'pandas.core.indexes.datetimes.DatetimeIndex'

So the type of the converted input and the type of the converted EXCEL values are not the same. What can I do to extract the time cells from EXCEL in a format that I can directly compare with the user input?

Upvotes: 0

Views: 1272

Answers (1)

TheFaultInOurStars
TheFaultInOurStars

Reputation: 3608

If I have correctly understood, you are getting string from excels, which is not the same type as the event_start column of your dataframe, which, to me, actually seems fine! You can compare a NumPy array of DatetimeIndex with a string. Take the code below as an example:

import pandas as pd
time = pd.DatetimeIndex(['1739-01-01', '1740-01-01', '1741-01-01', '1742-01-01',
               '1744-01-01', '1752-01-01', '1753-01-01', '1755-01-01',
               '1756-01-01', '1757-01-01'],
              dtype='datetime64[ns]', freq=None)
time == '1753-01-01'

which results in:

array([False, False, False, False, False, False,  True, False, False,
       False])

As you can see, the seventh element comes True. If you are more interested in converting the string to an date object, I suggest you to use dateutil.parser.parse exactly as what follows:

import pandas as pd
from dateutil.parser import parse
time = pd.DatetimeIndex(['1739-01-01', '1740-01-01', '1741-01-01', '1742-01-01',
               '1744-01-01', '1752-01-01', '1753-01-01', '1755-01-01',
               '1756-01-01', '1757-01-01'],
              dtype='datetime64[ns]', freq=None)
time == parse(input('Enter a date string here: '))

Result:

Enter a date string here: 1755-01-01
array([False, False, False, False, False, False, False,  True, False,
       False])

Upvotes: 1

Related Questions