kevin

Reputation: 2014

Extract weeks from datetime (Python Pandas)

I have a dataframe:

    time            year    month
0   12/28/2013 0:17 2013    12
1   12/28/2013 0:20 2013    12
2   12/28/2013 0:26 2013    12
3   12/29/2013 0:20 2013    12
4   12/29/2013 0:26 2013    12
5   12/30/2013 0:31 2013    12
6   12/30/2013 0:31 2013    12
7   12/31/2013 0:17 2013    12
8   12/31/2013 0:20 2013    12
9   12/31/2013 0:26 2013    12
10  1/1/2014 4:30   2014    1
11  1/1/2014 4:34   2014    1
12  1/1/2014 4:37   2014    1
13  1/2/2014 4:30   2014    1
14  1/2/2014 5:30   2014    1
15  1/3/2014 4:30   2014    1
16  1/3/2014 4:34   2014    1
17  1/3/2014 4:37   2014    1
18  1/4/2014 4:30   2014    1
19  1/4/2014 4:34   2014    1
20  1/4/2014 4:37   2014    1

I use the following code to extract the week information:

df['week'] = df['time'].dt.week  # removed in pandas 2.0; use df['time'].dt.isocalendar().week there

This produces the following dataframe:

    time               year month   week
0   2013-12-28 00:17:00 2013    12  52
1   2013-12-28 00:20:00 2013    12  52
2   2013-12-28 00:26:00 2013    12  52
3   2013-12-29 00:20:00 2013    12  52
4   2013-12-29 00:26:00 2013    12  52
5   2013-12-30 00:31:00 2013    12  1
6   2013-12-30 00:31:00 2013    12  1
7   2013-12-31 00:17:00 2013    12  1
8   2013-12-31 00:20:00 2013    12  1
9   2013-12-31 00:26:00 2013    12  1
10  2014-01-01 04:30:00 2014    1   1
11  2014-01-01 04:34:00 2014    1   1
12  2014-01-01 04:37:00 2014    1   1
13  2014-01-02 04:30:00 2014    1   1
14  2014-01-02 05:30:00 2014    1   1
15  2014-01-03 04:30:00 2014    1   1
16  2014-01-03 04:34:00 2014    1   1
17  2014-01-03 04:37:00 2014    1   1
18  2014-01-04 04:30:00 2014    1   1
19  2014-01-04 04:34:00 2014    1   1
20  2014-01-04 04:37:00 2014    1   1

I would like to create another column showing year-week (e.g., 2013-52, 2014-1). The problem is that when I combine the two columns (year, week), rows 5 through 9 come out as 2013-1, which would mean the first week of 2013. That is not correct. Is there a solution for this?

Upvotes: 5

Views: 5449

Answers (2)

piRSquared

Reputation: 294218

Use dt.strftime (reference: http://strftime.org/):

df.time.dt.strftime('%Y-%W')

0     2013-51
1     2013-51
2     2013-51
3     2013-51
4     2013-51
5     2013-52
6     2013-52
7     2013-52
8     2013-52
9     2013-52
10    2014-00
11    2014-00
12    2014-00
13    2014-00
14    2014-00
15    2014-00
16    2014-00
17    2014-00
18    2014-00
19    2014-00
20    2014-00
Name: time, dtype: object
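
Note that %W is not the ISO week number that dt.week returns: it counts weeks starting on Monday, and any days before the first Monday of the year fall in week 00. That is why the output above shows 2013-51 and 2014-00 rather than 2013-52 and 2014-1. A minimal sketch with the standard library, using three dates from the question (the variable names are only illustrative):

```python
from datetime import datetime

# Three dates from the question: a Saturday, the following Monday, and New Year's Day.
stamps = [datetime(2013, 12, 28), datetime(2013, 12, 30), datetime(2014, 1, 1)]

# %W: week of the year with Monday as the first day; days before the
# first Monday of the year belong to week 00.
labels = [d.strftime("%Y-%W") for d in stamps]
print(labels)
```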

Upvotes: 6

Filip Kilibarda

Reputation: 2668

As @TrigonaMinima pointed out, ISO 8601 (which dt.week follows) defines the first week of the year as follows:

It is the first week with a majority (4 or more) of its days in January

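In your case, week 1 has 2 days in December (the 30th and 31st) and the rest in January, so it fits the definition of the first week of 2014. The week number is correct; it is the calendar year in the year column that does not match it.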
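
One way to get a consistent year-week label is to take the year from the ISO calendar as well, instead of the calendar year. The following is a sketch, assuming pandas 1.1 or later (where dt.isocalendar() is available); the small dataframe and the column name year_week are made up for illustration:

```python
import pandas as pd

# A few timestamps from the question, spanning the year boundary.
df = pd.DataFrame({
    "time": pd.to_datetime(
        ["12/28/2013 0:17", "12/30/2013 0:31", "12/31/2013 0:26",
         "1/1/2014 4:30", "1/4/2014 4:37"],
        format="%m/%d/%Y %H:%M",
    )
})

# isocalendar() returns a DataFrame with ISO 8601 'year', 'week' and 'day' columns.
iso = df["time"].dt.isocalendar()

# Pair the ISO week with the ISO year so Dec 30-31, 2013 land in 2014-1.
df["year_week"] = iso["year"].astype(str) + "-" + iso["week"].astype(str)
print(df)
```

With this, December 28 stays in 2013-52, while December 30 and 31 are labelled 2014-1 together with the first days of January.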

Upvotes: 0
