excelguy
excelguy

Reputation: 1624

Pandas Filter date

I have a dataframe like the following,

+-----------+-------+----------+--+--+
| Date      | OPP   | Result   |  |  |
+-----------+-------+----------+--+--+
| Sat 11/16 | @DAL  | L110-102 |  |  |
+-----------+-------+----------+--+--+
| Wed 11/13 | @POR  | W114-106 |  |  |
+-----------+-------+----------+--+--+
| Mon 11/11 | @LAC  | L98-88   |  |  |
+-----------+-------+----------+--+--+
| Sun 11/10 | @LAL  | W113-104 |  |  |
+-----------+-------+----------+--+--+
| Fri 11/8  | @NO   | W122-104 |  |  |
+-----------+-------+----------+--+--+
| Wed 11/6  | vsSAC | W124-120 |  |  |
+-----------+-------+----------+--+--+
| Sat 11/2  | @MIL  | L115-105 |  |  |
+-----------+-------+----------+--+--+

I am trying to filter for dates > _____ .

Ie here is what I have tried, but does not filter all the date greater then Sun 11/10

d1 = d1[(d1['Date'] > 'Sun 11/10')]

Update

My column now looks like this, I need to be able to filter New_Date > _____ and exclude "NaT". Trying d1[(d1['New_Date'] > '2019-11-01')] but not working.

0    2019-11-20
1    2019-11-18
2    2019-11-16
3    2019-11-13
4    2019-11-11
5    2019-11-10
6    2019-11-08
7    2019-11-06
8    2019-11-02
9    2019-11-01
10   2019-10-30
11   2019-10-28
12   2019-10-26
13   2019-10-01
14          NaT
15          NaT
16   2019-10-18
17   2019-10-13
18   2019-10-10
19   2019-10-08
20          NaT
21          NaT

Any help would be appreciated.

Upvotes: 1

Views: 277

Answers (3)

exan
exan

Reputation: 3925

From what I understand, are you trying to filter dates just to remove NaTs? Then that's not how you do it. Infact, Pandas have several functions that can check for NaTs e.g. pandas.isnull(), pandas.notna(), pandas.DataFrame.notna(), etc. :

Here's a quick example as how to use it

    >>> import pandas as pd
    >>> pd.isnull(np.datetime64('NaT'))
    True
    >>> d1[d1['New_Date'].notna()]

Upvotes: 0

oppressionslayer
oppressionslayer

Reputation: 7204

Are you open to changing the format to another, or having a separate column with a date that is searchable by your critieria? I wrote this solution showing that route, maybe it will help you with ideas, and i'm open to suggestions for changes:

newdf = pd.read_csv('testdata2.csv', parse_dates=["Date"], date_parser=lambda x: pd.to_datetime(x, format="%a %m/%d"), 
index_col="Date") 

newdf = newdf.reset_index()   
newdf['Date'] = newdf['Date'].mask(newdf['Date'].dt.year == 1900, newdf['Date'] + pd.offsets.DateOffset(year=2019))

newdf[(newdf['Date'] > '2019-11/10')]
#Out[63]: 
#        Date   OPP    Result
#0 2019-11-16  @DAL  L110-102
#1 2019-11-13  @POR  W114-106
#2 2019-11-11  @LAC    L98-88

Upvotes: 1

Georgina Skibinski
Georgina Skibinski

Reputation: 13377

First you need to convert your date into proper datetime object, providing proper input format (which I assumed is <weekday> <month>/<day> - you can tweak it as per datetime documentation: https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior ). You can do it in a following way:

df["Date_2"]=pd.to_datetime(df["Date"].astype(str), format="%a %m/%d")

Then use analogical transformation on the filter criteria using python datetime.strptime() method:

from datetime import datetime

df.loc[df["Date_2"]>datetime.strptime("Thu 11/12", "%a %m/%d")]

So the whole example:

import pandas as pd
from datetime import datetime

df=pd.DataFrame({"Date": ["Sat 11/16", "Fri 11/8", "Wed 11/13"], "x": [4,3,7]})


df["Date_2"]=pd.to_datetime(df["Date"].astype(str), format="%a %m/%d")
print("\nExample filtered: ")
print(df.loc[df["Date_2"]>datetime.strptime("Thu 11/12", "%a %m/%d")])
print("\nThe whole thing:")
print(df)

And output:

Example filtered:
        Date  x     Date_2
0  Sat 11/16  4 1900-11-16
2  Wed 11/13  7 1900-11-13

The whole thing:
        Date  x     Date_2
0  Sat 11/16  4 1900-11-16
1   Fri 11/8  3 1900-11-08
2  Wed 11/13  7 1900-11-13

(Since no year is provided it's assumed to be 1900)

Upvotes: 1

Related Questions