Reputation: 197
I have a dataframe as shown below:
df =
index column1 column2 column3 column4
2014-5-21 2 3.4 4.3 3
2014-5-22 34 5 2 666
...
2014-12-31 9 4.3 4.3 1
I would like to create a function, fullyear(df), so that when I pass in a dataframe that does not cover every date of the year, it returns a new dataframe like this:
index column1 column2 column3 column4
2014-1-1 NaN NaN NaN NaN
2014-1-2 NaN NaN NaN NaN
...
2014-5-21 2 3.4 4.3 3
2014-5-22 34 5 2 666
...
2014-12-31 9 4.3 4.3 1
The missing dates should be filled in automatically, with NaN in every column for those rows.
The dates present in the dataframe are arbitrary, so I don't have a good idea of how to solve this. Does anyone have a suggestion? Thanks in advance!
Upvotes: 1
Views: 37
Reputation: 862711
Use reindex with an index built by date_range:
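import pandas as pd

# every calendar day of 2014
idx = pd.date_range('2014-01-01', '2014-12-31')
# the index must be a DatetimeIndex for the labels to match
df.index = pd.to_datetime(df.index)
df = df.reindex(idx)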
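To see what reindex does here, a minimal sketch on a short window may help. It assumes two rows borrowed from the question and a made-up five-day range; the names small and filled are only for illustration. Dates in the new index that are missing from the original rows come back as NaN.

```python
import pandas as pd

# Two rows borrowed from the question; the index starts out as plain strings.
small = pd.DataFrame(
    {"column1": [2, 34], "column2": [3.4, 5.0]},
    index=["2014-5-21", "2014-5-22"],
)
small.index = pd.to_datetime(small.index)

# Illustrative five-day window around the existing rows.
idx = pd.date_range("2014-05-20", "2014-05-24")

# Labels in idx that are absent from small get NaN in every column.
filled = small.reindex(idx)
print(filled)
```

Because NaN is a float, integer columns such as column1 are converted to float after reindexing.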
For a more dynamic solution, you can take the minimum and maximum year from the index:
df.index = pd.to_datetime(df.index)
y = df.index.year
idx = pd.date_range('{}-01-01'.format(y.min()), '{}-12-31'.format(y.max()))
df = df.reindex(idx)
print (df.tail())
column1 column2 column3 column4
2014-12-27 NaN NaN NaN NaN
2014-12-28 NaN NaN NaN NaN
2014-12-29 NaN NaN NaN NaN
2014-12-30 NaN NaN NaN NaN
2014-12-31 9.0 4.3 4.3 1.0
Finally, wrap it in a function:
def fullyear(df):
    df.index = pd.to_datetime(df.index)
    y = df.index.year
    idx = pd.date_range('{}-01-01'.format(y.min()), '{}-12-31'.format(y.max()))
    return df.reindex(idx)
df1 = fullyear(df)
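Note that fullyear assigns to df.index, so the frame passed in is modified as well. If that is not wanted, a variant that works on a copy could look like the sketch below. The name fullyear_copy is hypothetical, and the sample frame only uses the three rows visible in the question.

```python
import pandas as pd


def fullyear_copy(df):
    """Return df reindexed to every calendar day of the years it spans.

    Hypothetical variant of fullyear: it works on a copy, so the
    caller's frame keeps its original index.
    """
    out = df.copy()
    out.index = pd.to_datetime(out.index)
    years = out.index.year
    idx = pd.date_range(f"{years.min()}-01-01", f"{years.max()}-12-31")
    return out.reindex(idx)


# Sample rows taken from the question.
df = pd.DataFrame(
    {
        "column1": [2, 34, 9],
        "column2": [3.4, 5.0, 4.3],
        "column3": [4.3, 2.0, 4.3],
        "column4": [3, 666, 1],
    },
    index=["2014-5-21", "2014-5-22", "2014-12-31"],
)

df1 = fullyear_copy(df)
print(df1.shape)  # (365, 4), since 2014 is not a leap year
print(df1.head())
```

If the data spans several years, the result covers every day from January 1 of the first year to December 31 of the last one.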
Upvotes: 1