kryogenic1

Reputation: 185

Resample daily data to hourly dataframe and copy contents

I have the following Dataframe:

    Date        Holiday
0   2018-01-01  New Year's Day
1   2018-01-15  Martin Luther King, Jr. Day
2   2018-02-19  Washington's Birthday
3   2018-05-08  Truman Day
4   2018-05-28  Memorial Day
...     ...     ...
58  2022-10-10  Columbus Day
59  2022-11-11  Veterans Day
60  2022-11-24  Thanksgiving
61  2022-12-25  Christmas Day
62  2022-12-26  Christmas Day (Observed)

I would like to resample this DataFrame from daily to hourly, copying the value in the Holiday column to every hour of the corresponding date. I'd like it to look like this (ignore the index in the table; the real result should have many more rows than this):

    Timestamp           Holiday
0   2018-01-01 00:00:00     New Year's Day
1   2018-01-01 01:00:00     New Year's Day
2   2018-01-01 02:00:00     New Year's Day
3   2018-01-01 03:00:00     New Year's Day
4   2018-01-01 04:00:00     New Year's Day
5   2018-01-01 05:00:00     New Year's Day
...     ...     ...
62  2022-12-26 20:00:00     Christmas Day (Observed)
63  2022-12-26 21:00:00     Christmas Day (Observed)
64  2022-12-26 22:00:00     Christmas Day (Observed)
65  2022-12-26 23:00:00     Christmas Day (Observed)

What's the fastest way to go about doing so? Thanks in advance.

Upvotes: 3

Views: 3133

Answers (2)

Terry

Reputation: 2811

(1) Build an hourly DataFrame with date_range, (2) concat it with the original DataFrame, (3) turn the dates back into a column with reset_index, (4) fill the empty slots within each day using groupby and ffill, (5) sort the values and drop the duplicate and NaN rows. Here df2 is your DataFrame, with the date column named date.

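import pandas as pd

# Hourly index from the first holiday through 23:00 on the last one
end = df2['date'].max() + pd.Timedelta(hours=23)
dates = pd.DataFrame(pd.date_range(df2['date'].min(), end, freq='h'), columns=['date']).set_index('date')

# Stack the holiday rows on top of the empty hourly rows
df3 = pd.concat([df2.set_index('date'), dates], sort=False)

df3.reset_index(inplace=True)

# Within each calendar day, copy the holiday name down to the hourly rows
df3['Holiday'] = df3.groupby(df3['date'].dt.date)['Holiday'].ffill()

# Drop the duplicated midnight rows and the hours that fall on non-holidays
df3 = df3.sort_values('date').drop_duplicates().dropna(axis=0)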
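
For a quick check of the same idea on a two-row sample, here is a minimal sketch. It is a variant, not the exact code above: instead of concat plus ffill, it looks up each hour's calendar day in the holiday table with map, and it assumes the holiday dates are unique. The sample frame and the names grid, names and out are made up for illustration.

```python
import pandas as pd

# Hypothetical two-row sample standing in for the real holiday table.
df2 = pd.DataFrame({
    "date": pd.to_datetime(["2018-01-01", "2018-01-15"]),
    "Holiday": ["New Year's Day", "Martin Luther King, Jr. Day"],
})

# Hourly grid from the first holiday through 23:00 on the last one.
start = df2["date"].min()
end = df2["date"].max() + pd.Timedelta(hours=23)
grid = pd.DataFrame({"date": pd.date_range(start, end, freq="h")})

# Look up each hour's calendar day in the holiday table
# (assumes every date appears only once in df2).
names = df2.set_index("date")["Holiday"]
grid["Holiday"] = grid["date"].dt.normalize().map(names)

# Hours on non-holidays have no match and are dropped.
out = grid.dropna(subset=["Holiday"]).reset_index(drop=True)
print(out.head())
```

Each holiday should end up with exactly 24 rows, so len(out) is 24 times the number of holidays.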

Upvotes: 0

ignoring_gravity

Reputation: 10486

How about

# ffill carries each holiday forward over every hour until the next holiday
df.set_index("Date").resample("h").ffill().reset_index().rename(
    {"Date": "Timestamp"}, axis=1
)
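
One thing to watch: resample followed by ffill also fills every hour between two holidays, and the last date only gets its midnight row. If you only want the 24 hours of each holiday, a rough sketch (assuming Date is already a datetime column; the sample frame and the names hourly and offsets are made up for illustration) is to repeat each row 24 times and add an hour offset:

```python
import pandas as pd

# Hypothetical two-row sample standing in for the real holiday table.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2018-01-01", "2018-01-15"]),
    "Holiday": ["New Year's Day", "Martin Luther King, Jr. Day"],
})

# Repeat every row 24 times, one copy per hour of the day.
hourly = df.loc[df.index.repeat(24)].reset_index(drop=True)

# Offsets 0h..23h, cycling once per original row.
offsets = pd.to_timedelta(hourly.index % 24, unit="h")

# Add the offsets to the dates and keep only the requested columns.
hourly = hourly.assign(Timestamp=hourly["Date"] + offsets)[["Timestamp", "Holiday"]]
print(hourly.head())
```

This never builds the full hourly range between the first and last date, so the result has exactly 24 rows per holiday.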

Upvotes: 2
