Reputation: 399
I have the following pandas.Dataframe
:
Date-Time Week Month
2017-07-29 16:05:00 30 Jul-17
2017-07-29 23:58:00 30 Jul-17
2017-07-30 04:46:00 31 Jul-17
2017-07-30 10:57:00 31 Jul-17
2017-07-31 09:36:00 31 Jul-17
2017-07-31 10:46:00 31 Jul-17
2017-08-01 09:06:00 31 Aug-17
2017-08-01 09:13:00 31 Aug-17
2017-08-02 16:02:00 31 Aug-17
2017-08-02 16:06:00 31 Aug-17
2017-08-02 16:55:00 31 Aug-17
2017-08-02 18:57:00 31 Aug-17
2017-08-02 21:56:00 31 Aug-17
2017-08-03 08:49:00 31 Aug-17
2017-08-03 14:16:00 31 Aug-17
2017-08-04 09:09:00 31 Aug-17
2017-08-04 10:08:00 31 Aug-17
2017-08-05 11:24:00 31 Aug-17
2017-08-05 12:42:00 31 Aug-17
2017-08-06 03:47:00 32 Aug-17
2017-08-06 03:55:00 32 Aug-17
2017-08-07 08:57:00 32 Aug-17
2017-08-07 11:35:00 32 Aug-17
Is there a way I can normalize the month value for each week to be the value of the month to which the mid-week day (Wednesday) belongs?
In the above dataframe, Wednesday of week 31 lies in August, so the month column for week 31 should be as follows:
Date-Time Week Month
2017-07-29 16:05:00 30 Jul-17
2017-07-29 23:58:00 30 Jul-17
2017-07-30 04:46:00 31 Aug-17
2017-07-30 10:57:00 31 Aug-17
2017-07-31 09:36:00 31 Aug-17
2017-07-31 10:46:00 31 Aug-17
2017-08-01 09:06:00 31 Aug-17
2017-08-01 09:13:00 31 Aug-17
2017-08-02 16:02:00 31 Aug-17
2017-08-02 16:06:00 31 Aug-17
2017-08-02 16:55:00 31 Aug-17
2017-08-02 18:57:00 31 Aug-17
2017-08-02 21:56:00 31 Aug-17
2017-08-03 08:49:00 31 Aug-17
2017-08-03 14:16:00 31 Aug-17
2017-08-04 09:09:00 31 Aug-17
2017-08-04 10:08:00 31 Aug-17
2017-08-05 11:24:00 31 Aug-17
2017-08-05 12:42:00 31 Aug-17
2017-08-06 03:47:00 32 Aug-17
2017-08-06 03:55:00 32 Aug-17
2017-08-07 08:57:00 32 Aug-17
2017-08-07 11:35:00 32 Aug-17
Please note that, both the Week and Month column of the dataframe are derived from the Date-Time column of the same dataframe.
Upvotes: 1
Views: 68
Reputation: 49784
You can use a timedelta to move your Wednesday back to the beginning of the week and then use pandas.Dateoffset
to clamp a date to a day of the week like:
df['DWY'] = df['Date-Time'] - pd.Timedelta(3, unit='d') - pd.DateOffset(weekday=2)
This is approximately fours time as fast as the string manipulations in your answer for long vectors.
import pandas as pd
data = u"""
date month
2016-12-31 1
2017-01-01 1
2017-01-28 1
2017-01-29 1
2017-02-25 1
2017-02-26 1
2017-04-01 1
2017-04-02 1
2017-04-29 1
2017-04-30 1
2017-06-03 1
2017-06-04 1
2017-07-01 1
2017-07-02 1
2017-07-29 1
2017-07-30 1
2017-09-02 1
2017-09-03 1
2017-09-30 1
2017-10-01 1
2017-10-28 1
2017-10-29 1
2017-12-02 1
2017-12-03 1
2017-12-30 1
2017-12-31 1"""
df = pd.read_fwf(StringIO(data), header=1)
df['date'] = pd.to_datetime(df['date'])
df['weds-week'] = df['date'] - pd.Timedelta(3, unit='d') - pd.DateOffset(weekday=2)
df['month'] = df['weds-week'].dt.month
print(df)
date month weds-week
0 2016-12-31 12 2016-12-28
1 2017-01-01 1 2017-01-04
2 2017-01-28 1 2017-01-25
3 2017-01-29 2 2017-02-01
4 2017-02-25 2 2017-02-22
5 2017-02-26 3 2017-03-01
6 2017-04-01 3 2017-03-29
7 2017-04-02 4 2017-04-05
8 2017-04-29 4 2017-04-26
9 2017-04-30 5 2017-05-03
10 2017-06-03 5 2017-05-31
11 2017-06-04 6 2017-06-07
12 2017-07-01 6 2017-06-28
13 2017-07-02 7 2017-07-05
14 2017-07-29 7 2017-07-26
15 2017-07-30 8 2017-08-02
16 2017-09-02 8 2017-08-30
17 2017-09-03 9 2017-09-06
18 2017-09-30 9 2017-09-27
19 2017-10-01 10 2017-10-04
20 2017-10-28 10 2017-10-25
21 2017-10-29 11 2017-11-01
22 2017-12-02 11 2017-11-29
23 2017-12-03 12 2017-12-06
24 2017-12-30 12 2017-12-27
25 2017-12-31 1 2018-01-03
Upvotes: 1
Reputation: 399
It can be done as follows:
df['DWY'] = df['Date-Time'].dt.strftime('%U-%Y')
df['DWY'] = '3-' + df['DWY'].astype(str)
for i in df.index:
df['DWY'][i] = dt.strptime(df['DWY'][i],'%w-%U-%Y')
However if there's a more efficient way of doing this, please let me know.
EDIT: The for loop can be replaced by this:
df['DWY'] = pd.to_datetime(df['DWY'],format='%w-%U-%Y')
Upvotes: 1