Bryn M.

Reputation: 45

Pandas Datetime column Filtering Issue when using variables

I'm trying to filter a pandas DataFrame based on the dates in one of its columns. For example, I have a column called 'Date' that has been converted to datetime using

df['Date'] = pd.to_datetime(df['Date'])

This puts the values in a format like 2019-06-01. Now I can filter on the column, so if I wanted only the dates in June I could do

df[(df['Date'] >= '2019-06-01') & (df['Date'] <= '2019-06-30')]

This works just fine, even though it compares the datetime column to a string; I assume pandas converts the string to a datetime automatically to perform the comparison.

However, this stops working as soon as I assign the comparison strings to variables. If I do this

start = '2019-06-01'
end = '2019-06-30'
df[(df['Date'] >= start) & (df['Date'] <= end)]

I get an error: TypeError: Invalid comparison between dtype=datetime64[ns] and str

Any ideas on why this may be occurring?

Upvotes: 0

Views: 892

Answers (1)

Valdi_Bo

Reputation: 30991

I use Pandas version 0.25 and Python version 3.7.0.

I ran your code:

start = '2019-06-01'
end = '2019-06-30'
df[(df['Date'] >= start) & (df['Date'] <= end)]

and got the expected result, with no error.

If you are on an older version of Python or pandas, consider upgrading.
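
To compare your setup with mine, you can print the versions you are running. This is only a quick check and it assumes pandas is importable as pd in the same environment where the error occurs:

```python
import sys

import pandas as pd

# Report the interpreter and pandas versions in use
python_version = sys.version.split()[0]
pandas_version = pd.__version__

print("Python:", python_version)
print("pandas:", pandas_version)
```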

I also checked two other variants:

  1. Converting the boundary values to datetime:

    d1 = pd.to_datetime('2019-06-01')
    d2 = pd.to_datetime('2019-06-30')
    df[df.Date.between(d1, d2)]
    
  2. Using between with both arguments as strings:

    df[df.Date.between('2019-06-01', '2019-06-30')]
    

Both also gave the correct result. Try them on your current installation and again after upgrading, if you decide to do so.
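
If you want to test this without your real data, here is a minimal sketch on a small made-up DataFrame (the sample rows and the Value column are my assumptions, not taken from your question). It converts the string variables to Timestamps explicitly, which should sidestep the datetime-versus-str comparison regardless of how a given pandas version treats plain strings:

```python
import pandas as pd

# Hypothetical sample data, only for illustration
df = pd.DataFrame({
    'Date': ['2019-05-31', '2019-06-01', '2019-06-15', '2019-06-30', '2019-07-01'],
    'Value': [1, 2, 3, 4, 5],
})
df['Date'] = pd.to_datetime(df['Date'])

start = '2019-06-01'
end = '2019-06-30'

# Convert the boundary strings to Timestamps before comparing
start_ts = pd.to_datetime(start)
end_ts = pd.to_datetime(end)

# Variant A: explicit comparison operators
by_comparison = df[(df['Date'] >= start_ts) & (df['Date'] <= end_ts)]

# Variant B: between(), which includes both boundaries by default
by_between = df[df['Date'].between(start_ts, end_ts)]

print(by_comparison)
print(by_between)
```

Both variants should select the same three June rows here. Note that '2019-06-30' converts to midnight at the start of that day, so if your real column also carries times, rows later on June 30 would fall outside the range.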

Upvotes: 1
