Jespar
Jespar

Reputation: 1026

Change year/quarter date format to previous period in python

I have a dataset containing monthly observations of a time-series. What I want to do is transform the datetime to year/quarter format and then extract the first value DATE[0] as the previous quarter. For example 2006-10-31 belongs to 4Q of 2006. But I want to change it to 2006Q3. For the extraction of the subsequent values I will just use the last value from each quarter. So, for 2006Q4 I will keep BBGN, SSD, and QQ4567 values only from DATE[2]. Similarly, for 2007Q1 I will keep only DATE[5] values, and so forth.

Original dataset:

               DATE                BBGN    SSD             QQ4567  
    0  2006-10-31 00:00:00        1.210  22.022          9726.550   
    1  2006-11-30 00:00:00        1.270  22.060          9891.008   
    2  2006-12-31 00:00:00        1.300  22.080         10055.466   
    3  2007-01-31 00:00:00        1.330  22.099         10219.924   
    4  2007-02-28 00:00:00        1.393  22.110         10350.406
    5  2007-03-31 00:00:00        1.440  22.125         10480.888 

After processing the DATE

       DATE          BBGN    SSD             QQ4567  
    0  2006Q3        1.210  22.022          9726.550   
    2  2006Q4        1.300  22.080         10055.466   
    5  2007Q1        1.440  22.125         10480.888 

The steps I have taken so far are:

Turn the values from the yyyy-mm-dd hh format to yyyyQQ format

DF['DATE'] = pd.to_datetime(DF['DATE']).dt.to_period('Q')

and I get this

    DATE          BBGN    SSD             QQ4567  
0  2006Q4        1.210  22.022          9726.550   
1  2006Q4        1.270  22.060          9891.008   
2  2006Q4        1.300  22.080         10055.466   
3  2007Q1        1.330  22.099         10219.924   
4  2007Q1        1.393  22.110         10350.406
5  2007Q1        1.440  22.125         10480.888

The next step is to extract the last values from each quarter. But because I always want to keep the first row I will exclude DATE[0] from the function.

quarterDF = DF.iloc[1:,].drop_duplicates(subset='DATE', keep='last')

Now, my question is how can I change the value in DATE[0] to always be the previous quarter. So, from 2006Q4 to be 2006Q3. Also, how this will work if DATE[0] is 2007Q1, can I change it to 2006Q4?

Upvotes: 1

Views: 1676

Answers (2)

Bharath M Shetty
Bharath M Shetty

Reputation: 30605

Here is a pivot_table approach

# Subtract the quarter from date save it in a column 
df['Q'] = df['DATE'] - pd.tseries.offsets.QuarterEnd()

#0   2006-09-30
#1   2006-09-30
#2   2006-09-30
#3   2006-12-31
#4   2006-12-31
#5   2006-12-31
#Name: Q, dtype: datetime64[ns]

# Drop and pivot for not including the columns 
ndf = df.drop(['DATE','Q'],1).pivot_table(index=pd.to_datetime(df['Q']).dt.to_period('Q'),aggfunc='last')

      BBGN     QQ4567     SSD
Qdate                          
2006Q3  1.30  10055.466  22.080
2006Q4  1.44  10480.888  22.125

Upvotes: 1

KPLauritzen
KPLauritzen

Reputation: 1869

My suggestion would be to create a new DATE column with a day 3 months in the past. Like this

import pandas as pd
df = pd.DataFrame()
df['Date'] = pd.to_datetime(['2006-10-31', '2007-01-31'])
one_quarter = pd.tseries.offsets.DateOffset(months=3)
df['Last_quarter'] = df.Date - one_quarter

This will give you

    Date    Last_quarter
0   2006-10-31  2006-07-31
1   2007-01-31  2006-10-31

Then you can do the same process as you described above on Last_quarter

Upvotes: 2

Related Questions