Reputation: 1041
I have a pandas DataFrame whose index represents dates (in monthly format) and whose columns hold numeric data. A minimal example is below:
import pandas as pd

dict1 = [{'var0': 45, 'var1': 3, 'var2': 2},
         {'var0': 32, 'var1': 4, 'var2': 4},
         {'var0': 23, 'var1': 5, 'var2': 8},
         {'var0': 22, 'var1': 2, 'var2': 12}]
df = pd.DataFrame(dict1, index=['2016-08', '2016-09', '2016-11', '2016-12'])
However, some of the months are missing: notice how the index jumps from September to November. I would like to fill in all of the missing months so that the new DataFrame contains an additional row for each missing month, with that month as its index and zeros in every column, that is:
dict1 = [{'var0': 45, 'var1': 3, 'var2': 2},
         {'var0': 32, 'var1': 4, 'var2': 4},
         {'var0': 0, 'var1': 0, 'var2': 0},
         {'var0': 23, 'var1': 5, 'var2': 8},
         {'var0': 22, 'var1': 2, 'var2': 12}]
df = pd.DataFrame(dict1, index=['2016-08', '2016-09', '2016-10', '2016-11', '2016-12'])
Can anyone recommend an approach?
Upvotes: 1
Views: 315
Reputation: 862471
Create a DatetimeIndex and use DataFrame.asfreq:
df.index = pd.to_datetime(df.index)
df = df.asfreq('MS', fill_value=0)
Or use DataFrame.reindex with pandas.date_range:
df = df.reindex(pd.date_range(df.index.min(), df.index.max(), freq='MS'), fill_value=0)
print(df)
            var0  var1  var2
2016-08-01    45     3     2
2016-09-01    32     4     4
2016-10-01     0     0     0
2016-11-01    23     5     8
2016-12-01    22     2    12
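As a self-contained check, the two approaches above can be run side by side on the question's sample data. This is a minimal sketch; the names `rows`, `via_asfreq`, `full_range`, and `via_reindex` are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data from the question: October 2016 is missing.
rows = [
    {"var0": 45, "var1": 3, "var2": 2},
    {"var0": 32, "var1": 4, "var2": 4},
    {"var0": 23, "var1": 5, "var2": 8},
    {"var0": 22, "var1": 2, "var2": 12},
]
df = pd.DataFrame(rows, index=["2016-08", "2016-09", "2016-11", "2016-12"])

# Parse the 'YYYY-MM' labels into month-start timestamps.
df.index = pd.to_datetime(df.index)

# Approach 1: asfreq builds the month-start range between min and max itself.
via_asfreq = df.asfreq("MS", fill_value=0)

# Approach 2: build the same range explicitly and reindex onto it.
full_range = pd.date_range(df.index.min(), df.index.max(), freq="MS")
via_reindex = df.reindex(full_range, fill_value=0)

print(via_asfreq)
print((via_asfreq.values == via_reindex.values).all())
```

Both calls insert a single row for 2016-10-01 filled with zeros, so which one to use is mostly a matter of whether the target range needs to be built explicitly (for example, to extend it beyond the first or last observed month).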
Solution with monthly periods: convert the index with DatetimeIndex.to_period and reindex with pandas.period_range:
df.index = pd.to_datetime(df.index).to_period('M')
df = df.reindex(pd.period_range(df.index.min(), df.index.max(), freq='M'), fill_value=0)
print(df)
         var0  var1  var2
2016-08    45     3     2
2016-09    32     4     4
2016-10     0     0     0
2016-11    23     5     8
2016-12    22     2    12
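With a period index it is also easy to see which months are absent before filling them. The following is a small sketch under the same sample data; `full` and `missing` are illustrative names, and the `difference` step is an optional diagnostic rather than part of the answer's method.

```python
import pandas as pd

df = pd.DataFrame(
    {"var0": [45, 32, 23, 22], "var1": [3, 4, 5, 2], "var2": [2, 4, 8, 12]},
    index=["2016-08", "2016-09", "2016-11", "2016-12"],
)

# Convert the string labels to monthly periods.
df.index = pd.to_datetime(df.index).to_period("M")

# Every month between the first and last observed period.
full = pd.period_range(df.index.min(), df.index.max(), freq="M")

# Months in the full range that have no row in df.
missing = full.difference(df.index)
print([str(p) for p in missing])

# Add the missing months as zero-filled rows.
filled = df.reindex(full, fill_value=0)
print(filled)
```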
Finally, if necessary, convert the index back to strings in YYYY-MM format with DatetimeIndex.strftime:
df.index = df.index.strftime('%Y-%m')
print(df)
         var0  var1  var2
2016-08    45     3     2
2016-09    32     4     4
2016-10     0     0     0
2016-11    23     5     8
2016-12    22     2    12
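The steps above (parse, fill, format back) can be wrapped into one helper. This is only a sketch: `fill_missing_months` is a hypothetical name, and it assumes the index holds unique 'YYYY-MM' strings.

```python
import pandas as pd


def fill_missing_months(frame, fill_value=0):
    """Return a copy of frame with a row for every month between the first
    and last index label; new rows are filled with fill_value.

    Hypothetical helper: assumes unique 'YYYY-MM' string labels in the index.
    """
    out = frame.copy()
    # Parse 'YYYY-MM' labels into month-start timestamps.
    out.index = pd.to_datetime(out.index, format="%Y-%m")
    # Insert the missing month-start rows.
    out = out.asfreq("MS", fill_value=fill_value)
    # Convert the index back to 'YYYY-MM' strings.
    out.index = out.index.strftime("%Y-%m")
    return out


df = pd.DataFrame(
    {"var0": [45, 32, 23, 22], "var1": [3, 4, 5, 2], "var2": [2, 4, 8, 12]},
    index=["2016-08", "2016-09", "2016-11", "2016-12"],
)
result = fill_missing_months(df)
print(result)
```

Working on a copy leaves the original DataFrame untouched, which makes it easy to compare the result against the input.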
Upvotes: 2