Fill dates with last value until current time

Question

I've a dataframe with the following structure:

Time	Company	Product_type	Total_sales
2021-01-31 06:00:00+00:00	Adidas	Shoes	20
2021-01-31 05:00:00+00:00	Adidas	Shoes	13
2021-01-31 03:00:00+00:00	Adidas	Shoes	4
2021-01-31 03:00:00+00:00	Nike	Shoes	5
2021-01-31 02:00:00+00:00	Adidas	Shoes	3
2021-01-31 02:00:00+00:00	Nike	Shoes	3

What I needed was to "fill" the time_hour gaps with the nearest previous value (time) according to their company and product_type. In this case, for Adidas, a row for the 04:00 is missing so it'll need to be filled with 4, the value from 03:00 sales.

That was solved in the following link so I now have this:

Time	Company	Product_type	Total_sales
2021-01-31 06:00:00+00:00	Adidas	Shoes	20
2021-01-31 05:00:00+00:00	Adidas	Shoes	13
2021-01-31 04:00:00+00:00	Adidas	Shoes	4
2021-01-31 03:00:00+00:00	Adidas	Shoes	4
2021-01-31 03:00:00+00:00	Nike	Shoes	5
2021-01-31 02:00:00+00:00	Adidas	Shoes	3
2021-01-31 02:00:00+00:00	Nike	Shoes	3

Now what I want to do is to fill it with the dates until the current time. For example today is 2021-02-05 so I need all the hours (rows) between 2021-01-31 06:00:00 and 2021-02-05 to be added with "20" for Adidas Shoes as the last total_sales value says.

I've thought about simply adding a row with the last value and the current date but it seems it won't work and would add a lot of possible bugs.

r-beginners · Accepted Answer

The code I've created is limited to Adidas, but you can combine Nike in the same way. The main point is to update the original dataframe by creating a pd.datetime_range() from the last data of the real data to today. At that time, we will fill in the holes backward.

df['Time'] = pd.to_datetime(df['Time'])
df.set_index('Time', inplace=True)
df_adidas = df.loc[df['Company'] == 'Adidas']
tidx = pd.date_range(df_adidas.head(1).index.values[0], datetime.date.today(), freq='1H', tz='UTC')
df_adidas.reindex(tidx, method='bfill')

    Company     Product_type    Total_sales
2021-01-31 06:00:00+00:00   Adidas  Shoes   20
2021-01-31 07:00:00+00:00   Adidas  Shoes   20
2021-01-31 08:00:00+00:00   Adidas  Shoes   20
2021-01-31 09:00:00+00:00   Adidas  Shoes   20
2021-01-31 10:00:00+00:00   Adidas  Shoes   20
...     ...     ...     ...
2021-02-05 20:00:00+00:00   Adidas  Shoes   20
2021-02-05 21:00:00+00:00   Adidas  Shoes   20
2021-02-05 22:00:00+00:00   Adidas  Shoes   20
2021-02-05 23:00:00+00:00   Adidas  Shoes   20
2021-02-06 00:00:00+00:00   Adidas  Shoes   20

Fill dates with last value until current time

Answers (2)

Related Questions