Reputation: 213
I downloaded a dataset from the internet which contains the data related to daily transactions in the following format [dummy values mentioned below] :
ID Date Item1 Item2 Item3 Item4 Item5
0 2017-08-28 1234 5678 91011 1213 1415
1 2017-07-27 1234 5678 91011 1213 1415
2 2017-06-26 1234 5678 91011 1213 1415
3 2017-05-25 1234 5678 91011 1213 1415
How do I use the date column to get the values for the transactions that happened in the last week , month and the last 3 months ?
I used this to format the date in the pandas format:
df['week'] = pd.to_datetime(df['week'])
And tried to get the values in the last week using this :
range_max = df['week'].max()
range_min = range_max - dt.timedelta(days=7)
# take slice with final week of data
sliced_df = df[(df['week'] >= range_min) & (df['week'] <= range_max)]
This works for retrieving last week's data , but how do I retrieve values for , say, a specific month or week ?
Upvotes: 1
Views: 148
Reputation: 18906
Here are some examples:
import pandas as pd
import datetime
import io
string = u"""ID,Date,Item1,Item2,Item3,Item4,Item5
0,2017-08-28,1234,5678,91011,1213,1415
1,2017-07-27,1234,5678,91011,1213,1415
2,2017-06-26,1234,5678,91011,1213,1415
3,2017-05-25,1234,5678,91011,1213,1415"""
df = pd.read_csv(io.StringIO(string))
df["Date"] = pd.to_datetime(df["Date"])
now = datetime.datetime.now().date()
mask1 = (df.Date < now) & (df.Date > now - datetime.timedelta(days=90)) # 90 days
mask2 = (df.Date.dt.month == 7) # specific month
mask3 = (df.Date.dt.week == 21) # specific week
mask4 = (df.Date.dt.month > now.month-3) # 3 months back
mask5 = (df.Date.dt.strftime("%B").isin(["May","July"])) # stfrtime and you can use this, quite clear!
mask6 = (df.Date.dt.strftime("%A").isin(["Monday","Tuesday"])) # stfrtime and you can use this, quite clear!
# Lastly
df[mask1]
#df[mask2]
#df[mask3]
...
It really boils down to what you exactly are looking for. Updated with months and weeksdays too.
Upvotes: 1