Insert missing month rows in a DataFrame in Python

Question

I have the following dataframe:

Input:-

ID month   Name
A1 2017.01 A
A1 2017.02 B
A1 2017.04 C
A2 2017.02 A
A2 2017.03 D
A2 2017.05 C

Output:-

ID month   Name
A1 2017.01 A
A1 2017.02 B
A1 2017.03 B
A1 2017.04 C
A2 2017.02 A
A2 2017.03 D
A2 2017.04 D
A2 2017.05 C

I need to insert a row for every month that is missing from the sequence, taking its value from the closest preceding month that is present in the input. Consider ID "A1": it has months 1, 2 and 4, so month 3 is missing. I therefore need to add a row with ID "A1", month "2017.03" and Name "B". Note that the "Name" column should take its value from the row immediately above it in the input.

How do I achieve this in pandas, or by any other method in Python?

Any help is appreciated! Thanks

Scott Boston · Accepted Answer

Let's try this with @EFT's suggestion:

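import pandas as pd
# Parse the 'month' values (e.g. 2017.01) into month-start timestamps
df['Date'] = pd.to_datetime(df.month.astype(str),format='%Y.%m')
# Within each ID, resample to month-start frequency and forward-fill the new rows
df_out = df.set_index('Date').groupby('ID').resample('MS').asfreq().ffill().reset_index(level=0, drop=True)
df_out = df_out.reset_index()
# Rebuild 'month' from the filled dates and drop the helper column
df_out['month'] = df_out.Date.dt.strftime('%Y.%m')
df_out = df_out.drop('Date',axis=1)
print(df_out)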

Output:

   ID    month Name
0  A1  2017.01    A
1  A1  2017.02    B
2  A1  2017.03    B
3  A1  2017.04    C
4  A2  2017.02    A
5  A2  2017.03    D
6  A2  2017.04    D
7  A2  2017.05    C
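
For anyone who wants to run this end to end, here is a self-contained sketch of the same idea. It is not the code from the answer: it fills each ID with a plain reindex over a monthly date_range, so it does not depend on how groupby(...).resample(...) treats the grouping column, which may differ across pandas versions. The helper name fill_missing_months is made up for illustration, and the two-decimal formatting of month (so that a float 2017.1 is read as October) is an assumption about how the column is stored.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": ["A1", "A1", "A1", "A2", "A2", "A2"],
    "month": ["2017.01", "2017.02", "2017.04", "2017.02", "2017.03", "2017.05"],
    "Name": ["A", "B", "C", "A", "D", "C"],
})


def fill_missing_months(frame):
    """Hypothetical helper: add a row for each missing month per ID,
    copying Name from the closest earlier month of the same ID."""
    # Format to two decimals first so a float such as 2017.1 becomes
    # "2017.10" (October). This reading of float input is an assumption.
    month_text = frame["month"].map(lambda m: f"{float(m):.2f}")
    dated = frame.assign(Date=pd.to_datetime(month_text, format="%Y.%m"))
    pieces = []
    for id_value, group in dated.groupby("ID"):
        # Assumes each (ID, month) pair occurs at most once; reindex
        # raises on duplicate index labels.
        names = group.set_index("Date")["Name"].sort_index()
        full_range = pd.date_range(names.index.min(), names.index.max(), freq="MS")
        # Missing months appear as NaN after reindex; ffill copies the
        # previous month's Name into them.
        filled = names.reindex(full_range).ffill()
        pieces.append(pd.DataFrame({
            "ID": id_value,
            "month": full_range.strftime("%Y.%m"),
            "Name": filled.to_numpy(),
        }))
    return pd.concat(pieces, ignore_index=True)


result = fill_missing_months(df)
print(result)
```

Because each ID is filled only between its own first and last month, no rows are added before an ID's first month or after its last one, which matches the expected output in the question.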
