Ben
Ben

Reputation: 69

Add Months to Data Frame using a period column

I'm looking to add a %Y%m%d date column to my dataframe using a period column that has integers 1-32, which represent monthly data points starting at a defined environment variable "odate" (e.g. if odate=20190531 then period 1 should be 20190531, period 2 should be 20190630, etc.)

I tried defining a dictionary with the number of periods in the column as the keys and the value being odate + MonthEnd(period -1)

This works fine and well; however, I want to improve the code to be flexible given changes in the number of periods.

Is there a function that will allow me to fill the date columns with the odate in period 1 and then subsequent month ends for subsequent periods?

example dataset:

odate=20190531

period value
1      5.5
2      5
4      6.2
3      5
5      40
11     5

desired dataset:

odate=20190531

period value date
1      5.5   2019-05-31
2      5     2019-06-30
4      6.2   2019-08-31
3      5     2019-07-31
5      40    2019-09-30
11     5     2020-03-31

Upvotes: 0

Views: 408

Answers (1)

Mohit Motwani
Mohit Motwani

Reputation: 4792

You can use pd.date_range():

pd.date_range(start = '2019-05-31', periods = 100,freq='M')

You can change total periods depending on what you need, the freq='M' means a Month-End frequency

Here is a list of Offset Aliases you can for freq parameter.

If you just want to add or subtract some period to a date you can use pd.DataOffset:

odate = pd.Timestamp('20191031')
odate
>> Timestamp('2019-10-31 00:00:00')

odate - pd.DateOffset(months=4)
>> Timestamp('2019-06-30 00:00:00')

odate + pd.DateOffset(months=4)
>> Timestamp('2020-02-29 00:00:00')

To add given the period column to Month Ends:

odate = pd.Timestamp('20190531')
df['date'] = df.period.apply(lambda x: odate + pd.offsets.MonthEnd(x-1))
df
 period value   date
0   1   5.5     2019-05-31
1   2   5.0     2019-06-30
2   4   6.2     2019-08-31
3   3   5.0     2019-07-31
4   5   40.0    2019-09-30
5   11  5.0     2020-03-31

To improve performance use list-comprehension:

df['date'] = [odate + pd.offsets.MonthEnd(period-1) for period in df.period]

Upvotes: 2

Related Questions