The Singularity
The Singularity

Reputation: 2698

Filtering Dataframe based on Multiple Date Conditions

I'm working with the following DataFrame:

id  slotTime    EDD EDD-10M
0   1000000101068957    2021-05-12  2021-12-26  2021-02-26
1   1000000100849718    2021-03-20  2021-04-05  2020-06-05
2   1000000100849718    2021-03-20  2021-04-05  2020-06-05
3   1000000100849718    2021-03-20  2021-04-05  2020-06-05
4   1000000100849718    2021-03-20  2021-04-05  2020-06-05

I would like to only keep the rows where the slotTime is between EDD-10M and EDD:

df['EDD-10M'] < df['slotTime'] < df['EDD']]

I have tried using the following method:

df.loc[df[df['slotTime'] < df['EDD']] & df[df['EDD-10M'] < df['slotTime']]]

However it yields the following error

TypeError: ufunc 'bitwise_and' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

Please Advise.

To replicate the above DataFrame use the below snippet:

import pandas as pd
from pandas import Timestamp

df = { 
  'id': {0: 1000000101068957,
  1: 1000000100849718,
  2: 1000000100849718,
  3: 1000000100849718,
  4: 1000000100849718,
  5: 1000000100849718,
  6: 1000000100849718,
  7: 1000000100849718,
  8: 1000000100849718,
  9: 1000000100849718},
  'EDD': {0: Timestamp('2021-12-26 00:00:00'),
  1: Timestamp('2021-04-05 00:00:00'),
  2: Timestamp('2021-04-05 00:00:00'),
  3: Timestamp('2021-04-05 00:00:00'),
  4: Timestamp('2021-04-05 00:00:00'),
  5: Timestamp('2021-04-05 00:00:00'),
  6: Timestamp('2021-04-05 00:00:00'),
  7: Timestamp('2021-04-05 00:00:00'),
  8: Timestamp('2021-04-05 00:00:00'),
  9: Timestamp('2021-04-05 00:00:00')},
 'EDD-10M': {0: Timestamp('2021-02-26 00:00:00'),
  1: Timestamp('2020-06-05 00:00:00'),
  2: Timestamp('2020-06-05 00:00:00'),
  3: Timestamp('2020-06-05 00:00:00'),
  4: Timestamp('2020-06-05 00:00:00'),
  5: Timestamp('2020-06-05 00:00:00'),
  6: Timestamp('2020-06-05 00:00:00'),
  7: Timestamp('2020-06-05 00:00:00'),
  8: Timestamp('2020-06-05 00:00:00'),
  9: Timestamp('2020-06-05 00:00:00')},
 'slotTime': {0: Timestamp('2021-05-12 00:00:00'),
  1: Timestamp('2021-03-20 00:00:00'),
  2: Timestamp('2021-03-20 00:00:00'),
  3: Timestamp('2021-03-20 00:00:00'),
  4: Timestamp('2021-03-20 00:00:00'),
  5: Timestamp('2021-03-20 00:00:00'),
  6: Timestamp('2021-03-20 00:00:00'),
  7: Timestamp('2021-03-20 00:00:00'),
  8: Timestamp('2021-03-20 00:00:00'),
  9: Timestamp('2021-03-20 00:00:00')}}

df = pd.DataFrame(df)

Upvotes: 2

Views: 934

Answers (3)

qualitaetsmuell
qualitaetsmuell

Reputation: 141

You can this by using query:

df.query("(slotTime < EDD) & (`EDD-10M` < slotTime)")

Upvotes: 1

sanzo213
sanzo213

Reputation: 139

you can use between() method someone already answered you or try like this

df.loc[(df['EDD-10M'] < df['slotTime']) & (df['slotTime'] < df['EDD'])]

you should use ( and ) multiple conditions

Upvotes: 1

Joran Beasley
Joran Beasley

Reputation: 113930

you just need to group your sides

df[(df['slotTime'] < df['EDD']) & (df['EDD-10M'] < df['slotTime'])]

otherwise order of operations tries to & things first and it all falls apart

alternatively you may wish to use the .between operator (assuming you have a datetime series

df[df['slotTime'].between(df['EDD'],df['EDD-10M'])]

Upvotes: 5

Related Questions