Reputation: 322
I am trying to filter a Pandas df by dates (today and yesterday). For automation purposes I wish to filter using a timestamp function. This is pretty seamless in R:
df %>%
filter(date >= today() - 1)
However, my attempts to replicate in Pandas are not reaching any success so far: Yesterday comes out fine, but .query() doesnt recognise it?
yesterday = (date.today() - timedelta(days=6)).strftime('%Y-%m-%d')
df.\
query('date >= yesterday')
Ideally I am seeking something all encompassing like:
df.\
query('date >= (date.today() - timedelta(days=6)).strftime('%Y-%m-%d')')
Upvotes: 4
Views: 2906
Reputation: 543
Try: df.query('date >= @yesterday')
. You need @
so pandas recognizes it's a variable.
Upvotes: 4
Reputation: 23099
IIUC, you want to create an outside varible to use inside your query?
from the docs
You can refer to variables in the environment by prefixing them with an ‘@’ character like @a + b.
using pandas only
import pandas as pd
df = pd.DataFrame({'date' : pd.date_range('01-02-2020','01-03-2021',freq='D')})
df = df.set_index('date')
delta = (pd.Timestamp('today') - pd.DateOffset(days=1)).strftime('%d-%m-%y')
df.query(f"date <= @delta")
date
2020-01-02
2020-01-03
2020-01-04
2020-01-05
2020-01-06
Upvotes: 3
Reputation: 29635
you can do it with string formatting:
df.query(f'date>= "{pd.Timestamp.today() - pd.Timedelta(days=6)}"')
Note: I tried with pd.Timestamp
and pd.Timedelta
but I'm sure it will work with date
and timedelta
as you used
Upvotes: 2