Reputation: 365
I have the following dataframe df
:
date USDBRL
0 1994-01-31 458.6600
1 1994-02-28 637.4500
2 1994-03-30 913.3450
3 1994-04-29 1302.2800
4 1994-05-31 1875.2700
.. ... ...
317 2020-06-30 5.4760
318 2020-07-31 5.2033
319 2020-08-31 5.4713
320 2020-09-30 5.6407
321 2020-10-09 5.5393
I would like to change all date
values to the first day of the month.
I tried this:
df['date'] = datetime(df['date'].dt.year, df['date'].dt.month, 1)
but I'm getting syntax error:
TypeError: cannot convert the series to <class 'int'>
Upvotes: 2
Views: 7517
Reputation: 62383
pandas.offsets.MonthBegin
to set the day to the beginning of the month.method 1
sets the day to the beginning of the month, but if the day is already the 1st, then the date is set to the first of the previous month.method 2
first sets the date to the end of the month, then to the first of the month.import pandas as pd
# sample data
df = pd.DataFrame({'date': [pd.Timestamp('1994-01-31 00:00:00'), pd.Timestamp('1994-02-28 00:00:00'), pd.Timestamp('1994-03-30 00:00:00'), pd.Timestamp('1994-04-29 00:00:00'), pd.Timestamp('1994-05-31 00:00:00'), pd.Timestamp('2010-01-01 00:00:00'), pd.Timestamp('2010-01-31 00:00:00'), pd.Timestamp('2010-03-02 00:00:00'), pd.Timestamp('2010-04-01 00:00:00'), pd.Timestamp('2010-05-01 00:00:00'), pd.Timestamp('2010-05-31 00:00:00'), pd.Timestamp('2020-06-30 00:00:00'), pd.Timestamp('2020-07-31 00:00:00'), pd.Timestamp('2020-08-31 00:00:00'), pd.Timestamp('2020-09-30 00:00:00'), pd.Timestamp('2020-10-09 00:00:00')]})
# method 1
df['m1'] = df.date - pd.offsets.MonthBegin(1)
# method 2
df['m2'] = df.date - pd.offsets.MonthEnd(0) - pd.offsets.MonthBegin(1)
method 1
, if the day of the month is already the first.
method 2
, which accounts for days on the 1st.Upvotes: 2
Reputation: 59519
pd.to_datetime
can assemble a datetime from a DataFrame with standard column names 'year'
, 'month'
and 'day
.
import pandas as pd
df = pd.DataFrame({'date': pd.date_range('2010-01-01', freq='30D', periods=6)})
df['new_date'] = pd.to_datetime(pd.DataFrame({'day': 1,
'month': df['date'].dt.month,
'year': df['date'].dt.year},
index=df.index))
date new_date
0 2010-01-01 2010-01-01
1 2010-01-31 2010-01-01
2 2010-03-02 2010-03-01
3 2010-04-01 2010-04-01
4 2010-05-01 2010-05-01
5 2010-05-31 2010-05-01
Upvotes: 2