john_mon

Reputation: 517

Setting the frequency of a time series pandas DataFrame with two index columns

I have a DataFrame, "df", with a two-level index: a datetime level called "Date" and a second level called "Location":

                        V1   V2  
Date       Location             
2001-01-01  1           0.5  0.7
            2           0.6  0.5
2001-01-02  3           0.8  0.2
            4           0.8  0.2
2001-01-03  5           0.2  0.4
            6           0.2  0.5
2001-01-04  7           0.2  0.3
            8           0.8  0.7

As you can see, the DataFrame has multiple observations for each date.

To use some statistical packages, I need to set the frequency of the DataFrame to daily, like this:

df = df.asfreq('d')

However, the DataFrame has two index levels: one is datetime and the other is not. When I tried to set the frequency with the code above, I got this error:

TypeError: Cannot convert input [(Timestamp('2002-07-23 00:00:00+0000', tz='UTC'), '1')] of type <class 'tuple'> to Timestamp

If I set only the date as the index, the same date appears multiple times in the frame, and pandas treats these repeated dates as duplicates.

How would you resolve this issue?

Upvotes: 2

Views: 1106

Answers (1)

jezrael

Reputation: 862681

The DataFrame has a MultiIndex, so one possible solution is to reshape it with DataFrame.unstack first to get a DatetimeIndex, apply asfreq, and then reshape back with DataFrame.stack:

df = df.unstack().asfreq('d').stack()
print(df)
                      V1   V2
Date       Location          
2001-01-01 1         0.5  0.7
           2         0.6  0.5
2001-01-02 3         0.8  0.2
           4         0.8  0.2
2001-01-03 5         0.2  0.4
           6         0.2  0.5
2001-01-04 7         0.2  0.3
           8         0.8  0.7

Upvotes: 2
