user7020864

Reputation: 11

Insert rows into a Pandas DataFrame where rows are missing based on previous row values

I am new to Python and this is my first post, so I apologize for any ambiguous phrasing.

I have a table in which column A counts from 1 to 5 and repeats for several cycles. I'd like to scan column A and, wherever a number in the sequence is missing, insert a row with the correct value for A, copy the value of column C, and leave column B as a missing value.

Even just inserting a row of missing values at the correct place would be helpful.

Example

Upvotes: 1

Views: 492

Answers (1)

jezrael

Reputation: 862581

You can label each 1-to-5 cycle with a group key, reindex by a MultiIndex.from_product of every group/A combination, and then fill the missing values in column C with ffill:

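import numpy as np
import pandas as pd

# Start a new group G whenever A does not increase (the 1..5 cycle restarts)
df['G'] = (df.A.diff().fillna(-1) < 1).cumsum()
df = df.set_index(['G', 'A'])
print(df)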
       B    C
G A          
1 1    1  Feb
  2    8  Feb
  4   64  Feb
  5  125  Feb
2 1    0  Feb
  3    6  Feb
  4   16  Feb
  5   31  Feb
3 1   -3  Feb
  3    4  Feb
  4   18  Feb
  5   29  Feb
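
# Build every (G, A) combination: each group crossed with A = 1..5
mux = pd.MultiIndex.from_product([df.index.get_level_values('G').unique(),
                                  np.arange(1, 6)], names=('G', 'A'))

# Reindexing adds the missing (G, A) rows, with NaN in B and C
df = df.reindex(mux)
# Copy C down from the previous row
df['C'] = df['C'].ffill()

# Drop the helper level G and turn A back into a column
df = df.reset_index(level=0, drop=True).reset_index()
print(df)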
    A      B    C
0   1    1.0  Feb
1   2    8.0  Feb
2   3    NaN  Feb
3   4   64.0  Feb
4   5  125.0  Feb
5   1    0.0  Feb
6   2    NaN  Feb
7   3    6.0  Feb
8   4   16.0  Feb
9   5   31.0  Feb
10  1   -3.0  Feb
11  2    NaN  Feb
12  3    4.0  Feb
13  4   18.0  Feb
14  5   29.0  Feb
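
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same steps. The input frame is an assumption reconstructed from the printed output above, since the question's original example is not shown, and `fill_missing_steps` is a hypothetical helper name used only for illustration.

```python
import numpy as np
import pandas as pd

# Assumed input, reconstructed from the printed frame in the answer above.
df = pd.DataFrame({
    "A": [1, 2, 4, 5, 1, 3, 4, 5, 1, 3, 4, 5],
    "B": [1, 8, 64, 125, 0, 6, 16, 31, -3, 4, 18, 29],
    "C": ["Feb"] * 12,
})


def fill_missing_steps(frame, start=1, stop=5):
    """Hypothetical helper: insert rows so A runs start..stop in every cycle."""
    # A new cycle begins wherever A does not increase from the previous row.
    group = (frame["A"].diff().fillna(-1) < 1).cumsum()
    indexed = frame.assign(G=group).set_index(["G", "A"])

    # Every (cycle, step) pair that should exist.
    full = pd.MultiIndex.from_product(
        [group.unique(), np.arange(start, stop + 1)], names=["G", "A"]
    )

    # Missing pairs appear as new rows filled with NaN.
    out = indexed.reindex(full)
    # Copy C down from the previous row; B stays NaN in the inserted rows.
    out["C"] = out["C"].ffill()

    # Drop the helper level G and restore A as an ordinary column.
    return out.reset_index(level="G", drop=True).reset_index()


result = fill_missing_steps(df)
print(result)
```

Two caveats: the grouping relies on A restarting at a lower value for each new cycle, and a plain ffill takes C from the previous cycle if the first row of a cycle is the one that is missing.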

Upvotes: 2
