How to fill in an incrementing integer in Pandas

Question

Given a pd.DataFrame such as:

print(pd.DataFrame([['a', 0, 'b'], ['c', 1, 'd'], ['f', 4, 'e']]))
   0  1  2
0  a  0  b
1  c  1  d
2  f  4  e

I would like to "fill in" rows by incrementing on the integer column. That is, I would like to obtain:

     0  1    2
0    a  0    b
1    c  1    d
2  NaN  2  NaN
3  NaN  3  NaN
4    f  4    e

As I will use this within a groupby operation on a large dataset, I am looking for the most efficient code to do this.

DSM · Accepted Answer

You could turn column 1 into the index and reindex over the full integer range (this assumes column 1 is sorted and has no duplicate values):

In [33]: df.set_index(1).reindex(range(df[1].iloc[0], df[1].iloc[-1]+1)).reset_index()
Out[33]: 
   1    0    2
0  0    a    b
1  1    c    d
2  2  NaN  NaN
3  3  NaN  NaN
4  4    f    e

and then you could reorder the columns if you cared.
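For reference, here is a self-contained sketch of the same idea, including the column reordering. It uses min() and max() instead of iloc[0] and iloc[-1], so it does not depend on column 1 being sorted; that swap is my assumption, not part of the one-liner above. The names full_range and filled are only illustrative, and column 1 still has to be free of duplicates for reindex to work.

```python
import pandas as pd

df = pd.DataFrame([['a', 0, 'b'], ['c', 1, 'd'], ['f', 4, 'e']])

# Every integer from the smallest to the largest value in column 1.
full_range = range(df[1].min(), df[1].max() + 1)

filled = (
    df.set_index(1)        # column 1 becomes the index
      .reindex(full_range) # missing integers become all-NaN rows
      .reset_index()       # column 1 comes back as the first column
)

# Restore the original column order (0, 1, 2).
filled = filled[list(df.columns)]
print(filled)
```

The reordering step is just a column selection, so it can be dropped if the column order does not matter.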

I don't know how this performs, but frankly custom groupby operations are pretty slow to start with. If speed is really critical, your best bet is to move this fill-in step out of the groupby entirely, if you can pull it off.
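One possible way to move the step out of the groupby is sketched below: build the full (group, integer) grid for all groups at once, then left-merge the original rows onto it. The column names g, n and val and the sample data are hypothetical, since the question does not show the grouped dataset. I have not timed this, so treat it as a starting point rather than a guaranteed speedup.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'g' is the grouping key, 'n' the integer column.
df = pd.DataFrame({
    'g': ['x', 'x', 'y', 'y'],
    'n': [0, 3, 5, 7],
    'val': ['a', 'b', 'c', 'd'],
})

# Per-group lower and upper bound of the integer column.
bounds = df.groupby('g')['n'].agg(['min', 'max'])
lengths = (bounds['max'] - bounds['min'] + 1).to_numpy()

# Position of each grid row within its own group: 0, 1, 2, ...
group_starts = np.cumsum(lengths) - lengths
offsets = np.arange(lengths.sum()) - np.repeat(group_starts, lengths)

# One row per (group, integer) pair in each group's full range.
grid = pd.DataFrame({
    'g': np.repeat(bounds.index.to_numpy(), lengths),
    'n': np.repeat(bounds['min'].to_numpy(), lengths) + offsets,
})

# Rows with no match in df get NaN in the remaining columns.
filled = grid.merge(df, on=['g', 'n'], how='left')
print(filled)
```

The only grouped work left is the built-in min/max aggregation; the rest is plain array arithmetic and a single merge.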
