Robert Chestnutt
Robert Chestnutt

Reputation: 322

How to filter a Pandas dataframe by timestamp functon using .query()

I am trying to filter a Pandas df by dates (today and yesterday). For automation purposes I wish to filter using a timestamp function. This is pretty seamless in R:

df %>% 
  filter(date >= today() - 1)

However, my attempts to replicate in Pandas are not reaching any success so far: Yesterday comes out fine, but .query() doesnt recognise it?

yesterday = (date.today() - timedelta(days=6)).strftime('%Y-%m-%d')
df.\
   query('date >= yesterday')

Ideally I am seeking something all encompassing like:

df.\
   query('date >= (date.today() - timedelta(days=6)).strftime('%Y-%m-%d')')

Upvotes: 4

Views: 2906

Answers (3)

Ugurite
Ugurite

Reputation: 543

Try: df.query('date >= @yesterday'). You need @ so pandas recognizes it's a variable.

Upvotes: 4

Umar.H
Umar.H

Reputation: 23099

IIUC, you want to create an outside varible to use inside your query?

from the docs

You can refer to variables in the environment by prefixing them with an ‘@’ character like @a + b.

using pandas only

import pandas as pd

df = pd.DataFrame({'date' : pd.date_range('01-02-2020','01-03-2021',freq='D')})
df = df.set_index('date')

delta = (pd.Timestamp('today') - pd.DateOffset(days=1)).strftime('%d-%m-%y')

df.query(f"date <= @delta")

  date
  2020-01-02
  2020-01-03
  2020-01-04
  2020-01-05
  2020-01-06

Upvotes: 3

Ben.T
Ben.T

Reputation: 29635

you can do it with string formatting:

df.query(f'date>= "{pd.Timestamp.today() - pd.Timedelta(days=6)}"')

Note: I tried with pd.Timestamp and pd.Timedelta but I'm sure it will work with date and timedelta as you used

Upvotes: 2

Related Questions