Reputation: 327
This code should print only the rows where the date in the column 'dataseparada' is later than the initial date (datai) and earlier than the end date (dataf), but many dates outside the range are printed as well. The format used is dd/mm/YYYY (e.g. 30/12/2020).
import pandas as pd
df=pd.read_csv('tst.csv')
datai='10/10/2020'
dataf='11/11/2020'
df=df[(df.dataseparada >= datai) & (df.dataseparada <= dataf)]
print(df.to_string())
'Tst' csv file :
The output prints all rows from the tst file, when it should print only the rows where dataseparada is > 10/10/2020 and < 11/11/2020. When I change datai and dataf the output is different, but always wrong.
I'm clueless about why it doesn't work. The same code works when filtering on values that aren't dates.
Edit:
dtypes of the columns:
nome object
datajunta int64
preco float64
dataseparada object
Upvotes: 0
Views: 41
Reputation: 110
Friend, my advice is this: the error is that comparing a date with a string does not work correctly. Check the data type of df.dataseparada (it is object, so the values are strings and are compared as text), then convert the column and your datai and dataf strings to datetime.
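To illustrate why comparing the strings goes wrong, here is a minimal sketch. The sample values are made up for illustration and are not from the original tst.csv; the names as_text and as_dates are hypothetical.

```python
import pandas as pd

# Hypothetical sample values in dd/mm/YYYY format (not from the original file).
s = pd.Series(['10/12/2021', '05/11/2020', '15/10/2020'])

# Comparing strings is done character by character, so the day digits
# decide the result before the month or year are even looked at.
as_text = (s >= '10/10/2020') & (s <= '11/11/2020')

# Parse both the column and the bounds as dates, then compare.
dates = pd.to_datetime(s, format='%d/%m/%Y')
datai = pd.to_datetime('10/10/2020', format='%d/%m/%Y')
dataf = pd.to_datetime('11/11/2020', format='%d/%m/%Y')
as_dates = (dates >= datai) & (dates <= dataf)

print(as_text.tolist())   # text comparison
print(as_dates.tolist())  # date comparison
```

With these values the text comparison keeps only 10/12/2021, which is outside the range, and drops the two dates that are inside it. The date comparison does the opposite.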
Upvotes: 1
Reputation: 323366
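Please convert the column to datetime before comparing. Since your dates are day-first, pass the format explicitly:
import pandas as pd
df = pd.read_csv('tst.csv')
# The explicit format makes '10/11/2020' parse as 10 November, not 11 October
s = pd.to_datetime(df['dataseparada'], format='%d/%m/%Y')
datai = '2020-10-10'
dataf = '2020-11-11'
df = df[(s >= datai) & (s <= dataf)]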
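A self-contained variant of the same idea, as a sketch only: the CSV text below is a hypothetical stand-in for tst.csv (only the column names come from the question), and Series.between replaces the two comparisons.

```python
import io
import pandas as pd

# Hypothetical stand-in for tst.csv; the real file contents are not shown.
csv_text = """nome,datajunta,preco,dataseparada
a,1,10.5,09/10/2020
b,2,20.0,10/10/2020
c,3,7.25,05/11/2020
d,4,3.0,12/11/2020
e,5,9.9,10/12/2021
"""
df = pd.read_csv(io.StringIO(csv_text))

# Parse the column in place with an explicit day-first format.
df['dataseparada'] = pd.to_datetime(df['dataseparada'], format='%d/%m/%Y')

datai = pd.Timestamp(2020, 10, 10)
dataf = pd.Timestamp(2020, 11, 11)

# Series.between includes both end points by default.
filtered = df[df['dataseparada'].between(datai, dataf)]
print(filtered['nome'].tolist())
```

Only rows b and c fall inside the range here. If the end points should be excluded, as the question's wording suggests, between accepts inclusive='neither'.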
Upvotes: 2