sandrosil
sandrosil

Reputation: 553

pandas - Resampling datetime index and extending to end of the month

I am trying to resample a datetime index into hourly data. I also want the resampling until the end of the month.

So given the following df:

data = np.arange(6).reshape(3,2)
rng = ['Jan-2016', 'Feb-2016', 'Mar-2016']
df = pd.DataFrame(data, index=rng)
df.index = pd.to_datetime(df.index)

            0  1
2016-01-01  0  1
2016-02-01  2  3
2016-03-01  4  5

I know I can resample this into an hourly index by: df = df.resample('H').ffill() However, when I call the df it gets cut at 2016-03-01. I am essentially making the index run from 1/1/2016 to 3/31/2016 with an hourly granularity.

How can I extend this to the end of the month 2015-03-31 given that the last index is the beginning of the month.

Upvotes: 5

Views: 4103

Answers (2)

El Jesús
El Jesús

Reputation: 11

Maybe it's late for this, but I think this way is easier:

import pandas as pd
import numpy as np
data = np.arange(6).reshape(3,2)
rng = ['Jan-2016', 'Feb-2016', 'Mar-2016']
df = pd.DataFrame(data, index=rng)
df.index = pd.to_datetime(df.index)

# Create the desired time range
t_index = pd.DatetimeIndex(pd.date_range(start='2016-01-01', end='2016-12-31', freq='h'))
# Resample
df_rsmpld = df.reindex(t_index, method='ffill')

Upvotes: 1

MaxU - stand with Ukraine
MaxU - stand with Ukraine

Reputation: 210832

UPDATE:

In [37]: (df.set_index(df.index[:-1].union([df.index[-1] + pd.offsets.MonthEnd(0)]))
   ....:    .resample('H')
   ....:    .ffill()
   ....:    .head()
   ....: )
Out[37]:
                     0  1
2016-01-01 00:00:00  0  1
2016-01-01 01:00:00  0  1
2016-01-01 02:00:00  0  1
2016-01-01 03:00:00  0  1
2016-01-01 04:00:00  0  1

In [38]: (df.set_index(df.index[:-1].union([df.index[-1] + pd.offsets.MonthEnd(0)]))
   ....:    .resample('H')
   ....:    .ffill()
   ....:    .tail()
   ....: )
Out[38]:
                     0  1
2016-03-30 20:00:00  2  3
2016-03-30 21:00:00  2  3
2016-03-30 22:00:00  2  3
2016-03-30 23:00:00  2  3
2016-03-31 00:00:00  4  5

Explanation:

In [40]: df.index[-1] + pd.offsets.MonthEnd(0)
Out[40]: Timestamp('2016-03-31 00:00:00')

In [41]: df.index[:-1].union([df.index[-1] + pd.offsets.MonthEnd(0)])
Out[41]: DatetimeIndex(['2016-01-01', '2016-02-01', '2016-03-31'], dtype='datetime64[ns]', freq=None)

Old incorrect answer:

In [77]: df.resample('M').ffill().resample('H').ffill().tail()
Out[77]:
                     0  1
2016-03-30 20:00:00  2  3
2016-03-30 21:00:00  2  3
2016-03-30 22:00:00  2  3
2016-03-30 23:00:00  2  3
2016-03-31 00:00:00  4  5

Upvotes: 4

Related Questions