Reputation: 185
I have the following DataFrame:
Date Holiday
0 2018-01-01 New Year's Day
1 2018-01-15 Martin Luther King, Jr. Day
2 2018-02-19 Washington's Birthday
3 2018-05-08 Truman Day
4 2018-05-28 Memorial Day
... ... ...
58 2022-10-10 Columbus Day
59 2022-11-11 Veterans Day
60 2022-11-24 Thanksgiving
61 2022-12-25 Christmas Day
62 2022-12-26 Christmas Day (Observed)
I would like to resample this DataFrame from daily to hourly frequency, copying the value in the Holiday column to every hour of the corresponding date. I'd like it to look like this (ignore the index shown in the table; the real one would have many more rows):
Timestamp Holiday
0 2018-01-01 00:00:00 New Year's Day
1 2018-01-01 01:00:00 New Year's Day
2 2018-01-01 02:00:00 New Year's Day
3 2018-01-01 03:00:00 New Year's Day
4 2018-01-01 04:00:00 New Year's Day
5 2018-01-01 05:00:00 New Year's Day
... ... ...
62 2022-12-26 20:00:00 Christmas Day (Observed)
63 2022-12-26 21:00:00 Christmas Day (Observed)
64 2022-12-26 22:00:00 Christmas Day (Observed)
65 2022-12-26 23:00:00 Christmas Day (Observed)
What's the fastest way to go about doing so? Thanks in advance.
Upvotes: 3
Views: 3133
Reputation: 2811
(1) Create a new DataFrame using date_range, (2) concat this with the original DF, (3) make dates a column again using reset_index, (4) fill the empty slots using groupby and ffill, (5) sort values and drop duplicates/NaN values.
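import pandas as pd
# df2 is the holiday frame with a datetime 'date' column; the range is extended by 23 hours so the last day is covered in full
dates = pd.DataFrame(pd.date_range(df2['date'].min(), df2['date'].max() + pd.Timedelta(hours=23), freq='H'), columns=['date']).set_index('date')
df3 = pd.concat([df2.set_index('date'), dates], sort=False)
df3.reset_index(inplace=True)
# forward-fill the holiday name within each calendar day only
df3['Holiday'] = df3.groupby(df3['date'].dt.date)['Holiday'].ffill()
# dropna removes the hours that belong to non-holiday days
df3 = df3.sort_values('date').drop_duplicates().dropna(axis=0)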
Upvotes: 0
Reputation: 10486
How about
df.set_index("Date").resample("H").ffill().reset_index().rename(
{"Date": "Timestamp"}, axis=1
)
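One thing to watch for: resample("H").ffill() also fills every hour between holidays, so the days after New Year's Day would still be labelled New Year's Day, and the final date only gets its 00:00 row. If only the holiday dates themselves should be expanded, a minimal sketch is to repeat each row 24 times and add an hour offset. The sample data and the names hourly and offsets are made up for illustration, and the sketch assumes the Date column is already a datetime type:

```python
import pandas as pd

# Hypothetical two-row sample standing in for the question's DataFrame.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2018-01-01", "2018-01-15"]),
    "Holiday": ["New Year's Day", "Martin Luther King, Jr. Day"],
})

# Repeat every daily row 24 times, one copy per hour of that day.
hourly = df.loc[df.index.repeat(24)].reset_index(drop=True)

# Position within each block of 24 rows gives the hour offset (0..23).
offsets = pd.to_timedelta(hourly.index % 24, unit="h")
hourly["Timestamp"] = hourly["Date"] + offsets

# Keep only the columns from the desired output.
hourly = hourly[["Timestamp", "Holiday"]]
print(hourly)
```

This avoids resample entirely, so non-holiday days never appear in the result and the last date gets all 24 hours.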
Upvotes: 2