splinter

Reputation: 3897

How to fill in an incrementing integer in Pandas

Given a pd.DataFrame such as:

print(pd.DataFrame([['a', 0, 'b'], ['c', 1, 'd'], ['f', 4, 'e']]))
   0  1  2
0  a  0  b
1  c  1  d
2  f  4  e

I would like to "fill in" rows by incrementing on the integer column. That is, I would like to obtain:

     0  1    2
0    a  0    b
1    c  1    d
2  NaN  2  NaN
3  NaN  3  NaN
4    f  4    e

As I will use this within a groupby operation on a large dataset, I am looking for the most efficient code to do this.

Upvotes: 0

Views: 1092

Answers (1)

DSM

Reputation: 353209

You could turn column 1 into an index and reindex over the full integer range (this assumes the values in column 1 are sorted and unique):

In [33]: df.set_index(1).reindex(range(df[1].iloc[0], df[1].iloc[-1]+1)).reset_index()
Out[33]: 
   1    0    2
0  0    a    b
1  1    c    d
2  2  NaN  NaN
3  3  NaN  NaN
4  4    f    e

and then you could reorder the columns if you cared.
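For reference, here is a self-contained sketch of the same idea that also restores the original column order. It uses min/max rather than the first and last rows, so it does not rely on column 1 being sorted, though it still assumes the values are unique integers. The names full_range and filled are just illustrative.

```python
import pandas as pd

df = pd.DataFrame([['a', 0, 'b'], ['c', 1, 'd'], ['f', 4, 'e']])

# Every integer from the smallest to the largest value in column 1.
# Naming the index 1 makes reset_index() recreate the column under that label.
full_range = pd.RangeIndex(df[1].min(), df[1].max() + 1, name=1)

filled = (
    df.set_index(1)        # column 1 becomes the index (values must be unique)
      .reindex(full_range) # missing integers become rows of NaN
      .reset_index()       # turn the index back into a regular column
)

# reset_index() puts column 1 first; select the columns in their original order.
filled = filled[df.columns.tolist()]
print(filled)
```

The only difference from the one-liner above is the explicit range construction and the final column selection.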

I don't know about performance, but frankly custom groupby operations are pretty slow to start with. If speed is really critical, your best bet is to move this incrementing operation out of the groupby entirely, if you can pull it off.
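One way to keep the fill out of a custom groupby function, as a rough sketch only: build the full (group, integer) grid with vectorized NumPy operations and left-merge the original rows onto it. The column names g, n and v and the sample data are hypothetical, since the question does not show the grouped frame, and I have not benchmarked this against the reindex approach.

```python
import numpy as np
import pandas as pd

# Hypothetical grouped data: 'g' is the group key, 'n' the integer column.
df = pd.DataFrame({
    'g': ['x', 'x', 'x', 'y', 'y'],
    'n': [0, 1, 4, 2, 4],
    'v': ['a', 'c', 'f', 'p', 'q'],
})

# Smallest and largest integer per group (a built-in, fast aggregation).
bounds = df.groupby('g')['n'].agg(['min', 'max'])
lengths = (bounds['max'] - bounds['min'] + 1).to_numpy()

# Repeat each group key and its starting value once per integer in its range.
keys = np.repeat(bounds.index.to_numpy(), lengths)
starts = np.repeat(bounds['min'].to_numpy(), lengths)

# Position of each row within its own group: 0, 1, ..., length - 1.
group_offsets = np.cumsum(lengths) - lengths
within = np.arange(lengths.sum()) - np.repeat(group_offsets, lengths)

grid = pd.DataFrame({'g': keys, 'n': starts + within})

# Left merge keeps every grid row; integers absent from df get NaN in 'v'.
filled = grid.merge(df, on=['g', 'n'], how='left')
print(filled)
```

Here the only groupby work is a min/max aggregation; the per-group ranges are expanded with np.repeat and a single merge instead of a Python-level function applied to each group.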

Upvotes: 2
