Vitor Oliveira

Reputation: 327

DataFrame dates are being filtered, but incorrectly

This code should print only the rows where the date in the column 'dataseparada' is later than the start date (datai) and earlier than the end date (dataf). However, many dates outside the range are printed as well. The date format is dd/mm/YYYY (e.g. 30/12/2020).

import pandas as pd
df=pd.read_csv('tst.csv')
datai='10/10/2020'
dataf='11/11/2020'
df=df[(df.dataseparada >= datai) & (df.dataseparada <= dataf)]
print(df.to_string())

The 'tst.csv' file:

[image of the CSV contents, not included]

The output prints all rows from the tst file, when it should print only the rows where dataseparada is > 10/10/2020 and < 11/11/2020. When I change datai and dataf the output is different, but always wrong.

I'm clueless about why it doesn't work. The same code works when filtering values that aren't dates.

Edit:

dtypes of the columns:

nome                        object
datajunta                    int64
preco                      float64
dataseparada                object

Upvotes: 0

Views: 41

Answers (2)

Carlos Chavita

Reputation: 110

Friend, my advice is this: the error is that comparing dates as strings does not work correctly. First check the data type of df.dataseparada and convert it to datetime if needed, then convert your datai and dataf strings to datetime as well.
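To illustrate the point, here is a minimal sketch of why the original filter misbehaves. Since 'dataseparada' has dtype object, the comparison is done on strings, character by character, not on calendar dates. The sample rows below are made up for illustration (the real tst.csv is not shown), and the names string_mask, parsed, start, end and date_mask are hypothetical.

```python
import pandas as pd

# Hypothetical sample rows in dd/mm/YYYY format (not the real tst.csv)
df = pd.DataFrame(
    {"dataseparada": ["15/10/2020", "11/01/2020", "10/12/2021", "05/11/2020"]}
)

datai = "10/10/2020"
dataf = "11/11/2020"

# dtype object: >= and <= compare the text lexicographically
string_mask = (df["dataseparada"] >= datai) & (df["dataseparada"] <= dataf)

# Parse everything as real dates, stating the dd/mm/YYYY format explicitly
parsed = pd.to_datetime(df["dataseparada"], format="%d/%m/%Y")
start = pd.to_datetime(datai, format="%d/%m/%Y")
end = pd.to_datetime(dataf, format="%d/%m/%Y")
date_mask = (parsed >= start) & (parsed <= end)

print(df.assign(string_mask=string_mask, date_mask=date_mask))
```

In this sample the string comparison keeps 11/01/2020 and 10/12/2021, which are outside the range, and drops 15/10/2020 and 05/11/2020, which are inside it. The datetime comparison gives the opposite, correct selection.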

Upvotes: 1

BENY

Reputation: 323366

Please convert your date column to datetime format before filtering:

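import pandas as pd

df = pd.read_csv('tst.csv')
# Parse the dd/mm/YYYY strings explicitly so the day is not read as the month
s = pd.to_datetime(df['dataseparada'], format='%d/%m/%Y')

# ISO-formatted bounds are unambiguous when compared with a datetime Series
datai = '2020-10-10'
dataf = '2020-11-11'
df = df[(s >= datai) & (s <= dataf)]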
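If you would rather keep writing the bounds as dd/mm/YYYY, the same idea can be wrapped in a small helper. This is only a sketch: filter_by_date_range and its parameters are hypothetical names, the sample rows are invented, and errors="coerce" is an assumption about how you want unparseable values handled (they become NaT and are excluded).

```python
import pandas as pd


def filter_by_date_range(df, column, start, end, fmt="%d/%m/%Y"):
    """Return the rows whose `column` date lies in [start, end], inclusive."""
    # Unparseable values become NaT, which never falls inside the range
    dates = pd.to_datetime(df[column], format=fmt, errors="coerce")
    lo = pd.to_datetime(start, format=fmt)
    hi = pd.to_datetime(end, format=fmt)
    # Series.between is inclusive on both ends by default
    return df[dates.between(lo, hi)]


# Hypothetical sample data standing in for tst.csv
sample = pd.DataFrame(
    {
        "nome": ["a", "b", "c", "d"],
        "dataseparada": ["15/10/2020", "11/01/2020", "not a date", "11/11/2020"],
    }
)

result = filter_by_date_range(sample, "dataseparada", "10/10/2020", "11/11/2020")
print(result.to_string())
```

The original column is left untouched as text, so the printed output still shows the dates in dd/mm/YYYY form.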

Upvotes: 2
