codetech
codetech

Reputation: 113

Fill in missing values based on series and populate second row based on previous or next row

I have a csv with 4 columns. The file contains some missing rows based on the series.

Input:-

No  A   B   C
1   10  50  12
3   40  50  12
4   20  60  15
6   80  80  18

Output:-

No  A   B   C
1   10  50  12
2   10  50  12
3   40  50  12
4   20  60  15
5   20  60  15
6   80  80  18

I need python and pandas code to generate the above output.

Upvotes: 1

Views: 115

Answers (2)

Vishnudev Krishnadas
Vishnudev Krishnadas

Reputation: 10960

Create a dataframe of your missing rows

missing_list = [[i] + [pd.np.nan]*(df.shape[1] - 1) for i in range(df.No.min(), df.No.max()) if i not in df.No]
missing_df = pd.DataFrame(missing_list, columns=df.columns)

Concat to original dataframe, sort and forward fill

pd.concat([df, missing_df]).sort_values('No').ffill()

Upvotes: 0

jezrael
jezrael

Reputation: 862481

Use if No is column - create index by No and DataFrame.reindex by range with all possible values:

v = range(df['No'].min(), df['No'].max() + 1)
df1 = df.set_index('No').reindex(v, method='ffill').reset_index()
print (df1)
   No   A   B   C
0   1  10  50  12
1   2  10  50  12
2   3  40  50  12
3   4  20  60  15
4   5  20  60  15
5   6  80  80  18

Use if No is index solution is changed a bit:

v = range(df.index.min(), df.index.max() + 1)
df1 = df.reindex(v, method='ffill')
print (df1)
     A   B   C
No            
1   10  50  12
2   10  50  12
3   40  50  12
4   20  60  15
5   20  60  15
6   80  80  18

Upvotes: 1

Related Questions