Justin Benfit
Justin Benfit

Reputation: 483

Datetime string needs to be converted to datetime object before matching on pandas column

Looking for clarification. I've seen several comments in SO posts saying emphatically that you can do a greater than less than comparison with a datetime column and a string formatted like a datetime object. I am finding this to be false so I was wondering if anyone could indeed confirm that this is not possible.

Here I have an example:

# between_dates_sales_seed_ae_sales_plan_ramped_date__current_date_interval_1_year____1_1_2000_
dat = '12/1/2000'
# between_dates_sales_seed_capacity_plan_by_rep_ramped_date__current_date_interval_1_year____1_1_2000_ 
ae_long['ramped_date'] = pd.to_datetime(ae_long['ramped_date'], errors='coerce').dt.strftime('%-m/%-d/%Y')
try:
    assert len(ae_long.loc[(ae_long['ramped_date'] > dat)]) == 0
except:
    print(ae_long.loc[(ae_long['ramped_date'] < dat)])

This returns many dates in ramped_date that are clearly greater than dat

            date  salesforce_user_id original_start_date ramped_date  \
3      1/31/2022  0051a000002Gxxxx           5/15/2018  10/31/2018   
14     1/31/2022  0051xxxxxxxxxxxxx          5/11/2019   1/31/2020   
15     1/31/2022  xxxxxxxxxxxxxxxxxxxxx      7/8/2019   1/31/2020   
16     1/31/2022  xxxxxxxxxxxxxxxx           9/16/2019   1/31/2020   

Is the only solution to convert dat to a datetime object? Thanks

Upvotes: 0

Views: 36

Answers (1)

Seon
Seon

Reputation: 3975

The comparison between strings is the same thing as the comparison between dates if and only if both your dates are formatted as YYYY-MM-DD, as comparison on strings happens in lexicographic order.

With a MM-DD-YYYY formatting, you get cases such as: 06/01/1999 > 05/01/2022, as 6>5.

Upvotes: 1

Related Questions