Charles R

Reputation: 1661

Get the first day of the week for a Pandas series

I have the following DataFrame:

import pandas as pd
from datetime import datetime, timedelta

df = pd.DataFrame([
        ["A", "2018-08-03"],
        ["B", "2018-08-20"]
])
df.columns = ["Item", "Date"]

I want to get the first day of the week for every row of my DataFrame. I tried this:

df['Date'] =  pd.to_datetime(df['Date'], format='%Y-%m-%d')
df["Day_of_Week"] = df.Date.dt.weekday

df["First_day_of_the_week"] = df.Date - timedelta(days=df.Day_of_Week)

But I got this error message:

TypeError: unsupported type for timedelta days component: Series

How can I get the first day of the week for a Series? My expected result is 2018-07-30 for item A and 2018-08-20 for item B.

Upvotes: 16

Views: 19518

Answers (7)

Manuel Montoya

Reputation: 1436

An alternative to jpp's answer, using only pandas (here the DataFrame is named df_plot and its datetime column is date):

df_plot['first_date'] = df_plot['date'] - pd.to_timedelta(df_plot['date'].dt.weekday, unit='days')
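Applied to the DataFrame from the question, the same idea might look like the sketch below. The column name First_day_of_the_week is taken from the question; the intermediate variable offset is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"Item": ["A", "B"], "Date": ["2018-08-03", "2018-08-20"]})
df["Date"] = pd.to_datetime(df["Date"], format="%Y-%m-%d")

# dt.weekday is 0 for Monday through 6 for Sunday, so it is also the
# number of days to step back to reach the Monday of the same week.
offset = pd.to_timedelta(df["Date"].dt.weekday, unit="D")

df["First_day_of_the_week"] = df["Date"] - offset
print(df)
```

pd.to_timedelta accepts a whole Series, which is why it works where datetime.timedelta(days=...) raises a TypeError.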

Upvotes: 0

Fernando Wittmann

Reputation: 2537

Here's a solution that doesn't require timedelta or lambda functions with subtractions:

df['Date'].dt.to_period('W').dt.to_timestamp()
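A short sketch of why this returns Mondays: the alias "W" is shorthand for "W-SUN", meaning weekly periods that end on Sunday, so each period begins on the preceding Monday. The variable names below are illustrative, and the "W-SAT" variant is only an assumption about what you might want if your weeks start on Sunday.

```python
import pandas as pd

dates = pd.Series(pd.to_datetime(["2018-08-03", "2018-08-20"]))

# "W" == "W-SUN": the period ends on Sunday, so it starts on Monday.
monday_start = dates.dt.to_period("W").dt.start_time

# Anchoring the period end on Saturday makes each week start on Sunday.
sunday_start = dates.dt.to_period("W-SAT").dt.start_time

print(monday_start)
print(sunday_start)
```

Here dt.start_time gives the same values as dt.to_timestamp() with its default settings.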

Upvotes: 6

ilia timofeev

Reputation: 1119

pandas version

df = pd.DataFrame({
    'Item': ['A', 'B'],
    'Date': ['2018-08-03', '2018-08-20']
})

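df['Date'] = pd.to_datetime(df.Date)  # parse the strings into datetime64 values
# Note: recent pandas versions deprecate the unit keyword of pd.TimedeltaIndex;
# pd.to_timedelta(df.Date.dt.dayofweek, unit='D') is the equivalent call.
df.Date - pd.TimedeltaIndex(df.Date.dt.dayofweek, unit='d')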

Output:

0   2018-07-30
1   2018-08-20
dtype: datetime64[ns]

Docs on used functions: pd.TimedeltaIndex, pd.to_datetime

Working with date and time: Time Series / Date functionality

Upvotes: 2

Brad Solomon

Reputation: 40878

You can stay in Pandas and use its DateOffset objects:

>>> from pandas.tseries.offsets import Week

>>> df.Date.where(df.Date.dt.weekday == 0, df.Date - Week(weekday=0))
0   2018-07-30
1   2018-08-20
Name: Date, dtype: datetime64[ns]

The trick is that you must skip the subtraction where the date is already a Monday (weekday == 0), because subtracting Week(weekday=0) from a Monday jumps back a full week. The where call says, "in cases where weekday is already zero, keep the date; otherwise, return Monday of that week."
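To see why the where guard matters, here is a minimal sketch on two single timestamps taken from the question's data (the variable names are illustrative):

```python
import pandas as pd
from pandas.tseries.offsets import Week

friday = pd.Timestamp("2018-08-03")
monday = pd.Timestamp("2018-08-20")

# Not on a Monday: the offset rolls back to the Monday of the same week.
from_friday = friday - Week(weekday=0)

# Already on a Monday: the offset moves back one whole week instead.
from_monday = monday - Week(weekday=0)

print(from_friday)
print(from_monday)
```

Without the guard, the second row of the question's DataFrame would land on 2018-08-13 rather than staying on 2018-08-20.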

Upvotes: 4

jpp

Reputation: 164623

A vectorised solution is possible with NumPy:

import numpy as np

df['First_day'] = df['Date'] - df['Date'].dt.weekday * np.timedelta64(1, 'D')

print(df)

  Item       Date  First_day
0    A 2018-08-03 2018-07-30
1    B 2018-08-20 2018-08-20

Upvotes: 20

Spinor8

Reputation: 1607

Leave out your "Day_of_Week" calculation and do this:

df["First_day_of_the_week"] = df['Date'].apply(lambda x: (x - timedelta(days=x.dayofweek)))
print(df)

giving

  Item       Date First_day_of_the_week
0    A 2018-08-03            2018-07-30
1    B 2018-08-20            2018-08-20

Upvotes: 4

kosnik

Reputation: 2424

Unfortunately, timedelta doesn't accept a Series as an argument, so I would go for an apply:

df["First_day_of_the_week"] = df.apply(lambda x: x['Date'] - timedelta(days=x['Day_of_Week']), axis=1)

EDIT

timedelta doesn't support vectorized arguments, but it can be multiplied by a vector :)

df["First_day_of_the_week"] = df.Date - df.Day_of_Week * timedelta(days=1)
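A small sketch of the difference between the two forms, using a stand-in Series of weekday numbers (day_of_week, offsets and raised are illustrative names, not part of the original code):

```python
from datetime import timedelta

import pandas as pd

day_of_week = pd.Series([4, 0])  # Friday and Monday, as in the question

# Passing the Series as the days argument fails, as in the question.
raised = False
try:
    timedelta(days=day_of_week)
except TypeError as exc:
    raised = True
    print(exc)

# Multiplying instead lets pandas scale one timedelta per element.
offsets = day_of_week * timedelta(days=1)
print(offsets)
```

The multiplication is handled by the Series rather than by timedelta itself, so the result is a timedelta Series that can be subtracted from the Date column directly.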

Upvotes: 8
