Chrissie M.
Chrissie M.

Reputation: 121

filtering date column in python

I'm new to python and I'm facing the following problem. I have a dataframe composed of 2 columns, one of them is date (datetime64[ns]). I want to keep all records within the last 12 months. My code is the following:

today=start_time.date()
last_year = today + relativedelta(months = -12)
new_df = df[pd.to_datetime(df.mydate) >= last_year]

when I run it I get the following message: TypeError: type object 2017-06-05

Any ideas? last_year seems to bring me the date that I want in the following format: 2017-06-05

Upvotes: 0

Views: 702

Answers (3)

Simeon Ikudabo
Simeon Ikudabo

Reputation: 2190

Create a time delta object in pandas to increment the date (12 months). Call pandas.Timstamp('now') to get the current date. And then create a date_range. Here is an example for getting monthly data for 12 months.

import pandas as pd
import datetime
list_1 = [i for i in range(0, 12)]
list_2 = [i for i in range(13, 25)]
list_3 = [i for i in range(26, 38)]

data_frame = pd.DataFrame({'A': list_1, 'B': list_2, 'C':list_3}, pd.date_range(pd.Timestamp('now'), pd.Timestamp('now') + pd.Timedelta           (weeks=53), freq='M'))

We create a timestamp for the current date and enter that as our start date. Then we create a timedelta to increment that date by 53 weeks (or 52 if you'd like) which gets us 12 months of data. Below is the output:

                              A   B   C
 2018-06-30 05:05:21.335625   0  13  26
 2018-07-31 05:05:21.335625   1  14  27
 2018-08-31 05:05:21.335625   2  15  28
 2018-09-30 05:05:21.335625   3  16  29
 2018-10-31 05:05:21.335625   4  17  30
 2018-11-30 05:05:21.335625   5  18  31
 2018-12-31 05:05:21.335625   6  19  32
 2019-01-31 05:05:21.335625   7  20  33
 2019-02-28 05:05:21.335625   8  21  34
 2019-03-31 05:05:21.335625   9  22  35
 2019-04-30 05:05:21.335625  10  23  36
 2019-05-31 05:05:21.335625  11  24  37

Upvotes: 1

jpp
jpp

Reputation: 164613

You can use pandas functionality with datetime objects. The syntax is often more intuitive and obviates the need for additional imports.

last_year = pd.to_datetime('today') + pd.DateOffset(years=-1)

new_df = df[pd.to_datetime(df.mydate) >= last_year]

As such, we would need to see all your code to be sure of the reason behind your error; for example, how is start_time defined?

Upvotes: 0

Ernest S Kirubakaran
Ernest S Kirubakaran

Reputation: 1564

Try

today = datetime.datetime.now()

Upvotes: 0

Related Questions