Reputation: 37
I have a dataframe that contains a year-quarter column, df.period,
and I want to convert that column into a DatetimeIndex
with the code below.
import pandas as pd

# path is my filepath
df = pd.read_excel(path, skiprows=[1, 2, 3, 4])  # skip rows 1-4 (0-indexed); row 0 stays as the header
idx = pd.to_datetime(df.period)
df = df.set_index(pd.DatetimeIndex(idx))
The code works well for me; however, it returns the first date of each quarter in PeriodIdx.
+------------+--------+
| PeriodIdx | Period |
+------------+--------+
| 2000-01-01 | 2000Q1 |
| 2000-04-01 | 2000Q2 |
| ... | ... |
+------------+--------+
But my desired output is the end date of each quarter, as shown below.
+------------+--------+
| PeriodIdx | Period |
+------------+--------+
| 2000-03-31 | 2000Q1 |
| 2000-06-30 | 2000Q2 |
| ... | ... |
+------------+--------+
There is a solution for R in "Forcing end of quarter date for as.Date(as.yearqtr())". Is there a pandas function that returns the last date of the quarter?
Upvotes: 3
Views: 648
Reputation: 150745
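One solution is to add an offset. pd.to_datetime gives the first day of each quarter, and adding MonthEnd(3) rolls that date forward three month ends, which lands on the last day of the quarter:
idx = pd.DatetimeIndex(pd.to_datetime(df.period) + pd.offsets.MonthEnd(3),
                       name='PeriodIdx')
df.set_index(idx)  # returns a new DataFrame; assign it back to keep the result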
Output:
Period
PeriodIdx
2000-03-31 2000Q1
2000-06-30 2000Q2
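An alternative that avoids the offset arithmetic is to parse the strings as quarterly periods and ask for the end of each period. This is only a minimal sketch: the two-row DataFrame is hypothetical sample data standing in for the Excel file, and it assumes the strings are in the 2000Q1 form shown in the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the Excel file in the question.
df = pd.DataFrame({"Period": ["2000Q1", "2000Q2"]})

# Parse the strings as quarterly periods.
quarters = pd.PeriodIndex(df["Period"], freq="Q")

# Take the end timestamp of each quarter, then normalize() to drop the
# time-of-day part so only the calendar date remains.
idx = quarters.to_timestamp(how="end").normalize().rename("PeriodIdx")

df = df.set_index(idx)
print(df)
```

Going through PeriodIndex keeps the quarter semantics explicit, so the result does not depend on the parsed date being the first day of the quarter.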
Upvotes: 1