Reputation: 637
Let us say I have the following pandas dataframe:
+---------------------+---------+-------+-----+
| observed_cats_count | year | month | day |
+---------------------+---------+-------+-----+
| 2 | 2019 | 10 | 19 |
| 3 | 2019 | 10 | 18 |
| 5 | 2019 | 10 | 16 |
+---------------------+---------+-------+-----+
And two boundary dates, say 2019-10-15
and 2019-10-20
, and I know that all missing observations should have observed_cats_count=0
.
How can insert a row for all missing dates in the interval and get the following dataframe:
+---------------------+---------+-------+-----+
| observed_cats_count | year | month | day |
+---------------------+---------+-------+-----+
| 0 | 2019 | 10 | 20 |
| 2 | 2019 | 10 | 19 |
| 3 | 2019 | 10 | 18 |
| 0 | 2019 | 10 | 17 |
| 5 | 2019 | 10 | 16 |
| 0 | 2019 | 10 | 15 |
+---------------------+---------+-------+-----+
Upvotes: 1
Views: 116
Reputation: 25239
I would use pd.date_range
to construct a new dataframe and merge back to df
and fillna
dates = pd.date_range('2019-10-20', '2019-10-15', freq='-1D')
df1 = pd.DataFrame({'year': dates.year, 'month': dates.month, 'day': dates.day})
df2 = df1.merge(df, how='left').fillna(0)
Out[413]:
year month day observed_cats_count
0 2019 10 20 0.0
1 2019 10 19 2.0
2 2019 10 18 3.0
3 2019 10 17 0.0
4 2019 10 16 5.0
5 2019 10 15 0.0
Upvotes: 2
Reputation: 862451
Idea is create DatetimeIndex
by to_datetime
with DataFrame.set_index
, so possible use Series.reindex
by all datetimes created by date_range
, then create columns from DatetimeIndex
, sorting it by DataFrame.sort_index
and last remove it by DataFrame.reset_index
with drop=True
:
rng = pd.date_range('2019-10-15','2019-10-20')
df = (df.set_index(pd.to_datetime(df[['year','month','day']]))['observed_cats_count']
.reindex(rng, fill_value=0).to_frame()
.assign(year=lambda x: x.index.year,
month=lambda x: x.index.month,
day=lambda x: x.index.day)
.sort_index(ascending=False)
.reset_index(drop=True))
print (df)
observed_cats_count year month day
0 0 2019 10 20
1 2 2019 10 19
2 3 2019 10 18
3 0 2019 10 17
4 5 2019 10 16
5 0 2019 10 15
Upvotes: 1