Reputation: 69
I'm looking to add a %Y%m%d date column to my dataframe using a period column that has integers 1-32, which represent monthly data points starting at a defined environment variable "odate" (e.g. if odate=20190531 then period 1 should be 20190531, period 2 should be 20190630, etc.)
I tried defining a dictionary with the number of periods in the column as the keys and the value being odate + MonthEnd(period -1)
This works fine and well; however, I want to improve the code to be flexible given changes in the number of periods.
Is there a function that will allow me to fill the date columns with the odate in period 1 and then subsequent month ends for subsequent periods?
example dataset:
odate=20190531
period value
1 5.5
2 5
4 6.2
3 5
5 40
11 5
desired dataset:
odate=20190531
period value date
1 5.5 2019-05-31
2 5 2019-06-30
4 6.2 2019-08-31
3 5 2019-07-31
5 40 2019-09-30
11 5 2020-03-31
Upvotes: 0
Views: 408
Reputation: 4792
You can use pd.date_range()
:
pd.date_range(start = '2019-05-31', periods = 100,freq='M')
You can change total periods depending on what you need, the freq='M'
means a Month-End frequency
Here is a list of Offset Aliases you can for freq
parameter.
If you just want to add or subtract some period to a date you can use pd.DataOffset
:
odate = pd.Timestamp('20191031')
odate
>> Timestamp('2019-10-31 00:00:00')
odate - pd.DateOffset(months=4)
>> Timestamp('2019-06-30 00:00:00')
odate + pd.DateOffset(months=4)
>> Timestamp('2020-02-29 00:00:00')
To add given the period column to Month Ends:
odate = pd.Timestamp('20190531')
df['date'] = df.period.apply(lambda x: odate + pd.offsets.MonthEnd(x-1))
df
period value date
0 1 5.5 2019-05-31
1 2 5.0 2019-06-30
2 4 6.2 2019-08-31
3 3 5.0 2019-07-31
4 5 40.0 2019-09-30
5 11 5.0 2020-03-31
To improve performance use list-comprehension
:
df['date'] = [odate + pd.offsets.MonthEnd(period-1) for period in df.period]
Upvotes: 2