bohnpessatti
bohnpessatti

Reputation: 120

Filter pandas dataframe based on date columns difference

What is the most efficient way to filter a pandas dataframe based on a difference between two date columns?

For example, based on the following dataframe:

   CADASTRO    RESPOSTA      EVAL 
0  2021-06-01  2021-06-13    y
1  2021-06-01  2021-06-13    y
2  2021-06-01  2021-06-18    y
3  2021-06-01  2021-06-09    n
4  2021-06-01  2021-06-20    n
5  2021-06-01  2021-06-20    n

How can I filter it to have only the records in which the difference between the columns RESPOSTA and CADASTRO is lower than 15 days? I tried the following, unsucessfully:

import datetime
filtered_df = df[(df.RESPOSTA - df.CADASTRO).days < 15]

The desired output is:

   CADASTRO    RESPOSTA      EVAL
0  2021-06-01  2021-06-13    y
1  2021-06-01  2021-06-13    y
3  2021-06-01  2021-06-09    n

Upvotes: 1

Views: 361

Answers (2)

blackraven
blackraven

Reputation: 5597

Use timedelta to compare the days difference

from datetime import timedelta

df = pd.DataFrame({
    'CADASTRO': ['2021-06-01', '2021-06-01', '2021-06-01', '2021-06-01', '2021-06-01', '2021-06-01'], 
    'RESPOSTA': ['2021-06-13', '2021-06-13', '2021-06-18', '2021-06-09', '2021-06-20', '2021-06-20'],
    'EVAL': ['y', 'y', 'y', 'n', 'n', 'n']
})

df['CADASTRO'] = pd.to_datetime(df['CADASTRO'])
df['RESPOSTA'] = pd.to_datetime(df['RESPOSTA'])
df['temp'] = df['RESPOSTA'] - df['CADASTRO']
df['temp'] = df['temp'].apply(lambda x: 0 if x < timedelta(days=15) else 1)
filtered_df = df.drop(df[df['temp']==0].index).drop(columns=['temp'])

Output filtered_df

enter image description here

Upvotes: 1

Henry Ecker
Henry Ecker

Reputation: 35626

Access days through the datetime accessor dt

# Ensure DateTime
df['CADASTRO'] = pd.to_datetime(df['CADASTRO'])
df['RESPOSTA'] = pd.to_datetime(df['RESPOSTA'])
# Access Days through dt.days
filtered_df = df[(df.RESPOSTA - df.CADASTRO).dt.days < 15]

filtered_df:

    CADASTRO   RESPOSTA EVAL
0 2021-06-01 2021-06-13    y
1 2021-06-01 2021-06-13    y
3 2021-06-01 2021-06-09    n

Upvotes: 5

Related Questions