Daniel Arges
Daniel Arges

Reputation: 365

How to change the day in datetime, to the first day of the current month

I have the following dataframe df:

          date     USDBRL
0   1994-01-31   458.6600
1   1994-02-28   637.4500
2   1994-03-30   913.3450
3   1994-04-29  1302.2800
4   1994-05-31  1875.2700
..         ...        ...
317 2020-06-30     5.4760
318 2020-07-31     5.2033
319 2020-08-31     5.4713
320 2020-09-30     5.6407
321 2020-10-09     5.5393

I would like to change all date values to the first day of the month.

I tried this:

df['date'] = datetime(df['date'].dt.year, df['date'].dt.month, 1)

but I'm getting syntax error:

TypeError: cannot convert the series to <class 'int'>

Upvotes: 2

Views: 7517

Answers (2)

Trenton McKinney
Trenton McKinney

Reputation: 62383

import pandas as pd

# sample data
df = pd.DataFrame({'date': [pd.Timestamp('1994-01-31 00:00:00'), pd.Timestamp('1994-02-28 00:00:00'), pd.Timestamp('1994-03-30 00:00:00'), pd.Timestamp('1994-04-29 00:00:00'), pd.Timestamp('1994-05-31 00:00:00'), pd.Timestamp('2010-01-01 00:00:00'), pd.Timestamp('2010-01-31 00:00:00'), pd.Timestamp('2010-03-02 00:00:00'), pd.Timestamp('2010-04-01 00:00:00'), pd.Timestamp('2010-05-01 00:00:00'), pd.Timestamp('2010-05-31 00:00:00'), pd.Timestamp('2020-06-30 00:00:00'), pd.Timestamp('2020-07-31 00:00:00'), pd.Timestamp('2020-08-31 00:00:00'), pd.Timestamp('2020-09-30 00:00:00'), pd.Timestamp('2020-10-09 00:00:00')]})

# method 1
df['m1'] = df.date - pd.offsets.MonthBegin(1)

# method 2
df['m2'] = df.date - pd.offsets.MonthEnd(0) - pd.offsets.MonthBegin(1)

enter image description here

  • As noted by ALollz in a comment, there can be issues with method 1, if the day of the month is already the first.
    • At his suggestion, is a more robust solution, method 2, which accounts for days on the 1st.

Upvotes: 2

ALollz
ALollz

Reputation: 59519

pd.to_datetime can assemble a datetime from a DataFrame with standard column names 'year', 'month' and 'day.

import pandas as pd
df = pd.DataFrame({'date': pd.date_range('2010-01-01', freq='30D', periods=6)})

df['new_date'] = pd.to_datetime(pd.DataFrame({'day': 1, 
                                              'month': df['date'].dt.month, 
                                              'year': df['date'].dt.year},
                                              index=df.index))

        date   new_date
0 2010-01-01 2010-01-01
1 2010-01-31 2010-01-01
2 2010-03-02 2010-03-01
3 2010-04-01 2010-04-01
4 2010-05-01 2010-05-01
5 2010-05-31 2010-05-01

Upvotes: 2

Related Questions