ikel
ikel

Reputation: 1978

always filter to last month in dataframe

i have a dataframe that has a column named "date found"

date found
11/3/2019 7:39
11/3/2019 7:46
11/3/2019 11:23
11/3/2019 7:39
11/3/2019 11:32
10/2/2019 4:33
10/3/2019 8:30
10/3/2019 8:02
09/3/2019 10:43
09/3/2019 10:33
08/3/2019 11:24

what is the best way to filter date so that the outcome dataframe always for the last month, for example, today is one of days in Nov, then the desired filtered dataframe is only for Oct. if today is in Oct (any day in Oct), then filtered dataframe will be one with only Sept data.

Upvotes: 0

Views: 1242

Answers (2)

Juan Kania-Morales
Juan Kania-Morales

Reputation: 588

Ticked answer is okay, but below example illustrates that about 30-times quicker is solution related to datetime format:

# create dataframe
df = \
pd.DataFrame(
    {
        'date found': ['11/3/2019 7:39', '11/3/2019 7:46', '11/3/2019 11:23', '11/3/2019 7:39', '11/3/2019 11:32', '10/2/2019 4:33', '10/3/2019 8:30', '10/3/2019 8:02', '09/3/2019 10:43', '09/3/2019 10:33', '08/3/2019 11:24']*100000,
        'values': [i for i in range(11)]*100000
    }
)

# impose datetime64 format on column 'date found'
df['date found'] = pd.to_datetime(df['date found'])

Okay solution:

%timeit
s1=pd.to_datetime(df['date found']).dt.strftime('%Y-%m')
s2=(pd.to_datetime('today')-pd.DateOffset(months=1)).strftime('%Y-%m')
df[s1==s2]

10.4 s ± 416 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

More time efficient:

%%timeit
mask = \
(df['date found'].dt.year == (pd.to_datetime('today')-pd.DateOffset(months=1)).year) & \
(df['date found'].dt.month == (pd.to_datetime('today')-pd.DateOffset(months=1)).month)
df[mask]

294 ms ± 35.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Upvotes: 1

BENY
BENY

Reputation: 323226

IIUC DateOffset

s1=pd.to_datetime(df['date found']).dt.strftime('%Y-%m')
s2=(pd.to_datetime('today')-pd.DateOffset(months=1)).strftime('%Y-%m')
df[s1==s2]
Out[249]: 
       date found
5  10/2/2019 4:33
6  10/3/2019 8:30
7  10/3/2019 8:02

Upvotes: 4

Related Questions