Kallol

Reputation: 2189

Fill rows with consecutive values and above rows using pandas

I have a data frame like this:

df

col1    col2
 1        A 
 3        B
 6        A
 10       C

I want to create a data frame from the above df such that, wherever the col1 values are not consecutive, a new row is added for each missing col1 value, with col2 taken from the row just above.

The data frame I am looking for should be:

df
col1    col2
 1        A
 2        A
 3        B
 4        B
 5        B
 6        A
 7        A
 8        A
 9        A
 10       C

I could do it with a simple for loop, but is there a more pythonic and efficient way to do it using pandas?

Upvotes: 7

Views: 1320

Answers (2)

yatu

Reputation: 88236

One way is using reindex with ffill:

(df.set_index('col1')
   .reindex(range(df.col1.iloc[0], df.col1.iloc[-1]+1))
   .ffill()
   .reset_index())

    col1 col2
0     1    A
1     2    A
2     3    B
3     4    B
4     5    B
5     6    A
6     7    A
7     8    A
8     9    A
9    10    C
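For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same reindex/ffill idea. It rebuilds the sample frame from the question and assumes col1 is sorted ascending with no duplicates; the names full_range and result are just for illustration.

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({"col1": [1, 3, 6, 10], "col2": ["A", "B", "A", "C"]})

# Every consecutive col1 value from the first to the last
# (assumption: col1 is sorted ascending and has no duplicates)
full_range = range(df["col1"].iloc[0], df["col1"].iloc[-1] + 1)

result = (
    df.set_index("col1")
      .reindex(full_range)  # missing col1 values become rows with NaN in col2
      .ffill()              # copy col2 down from the row above
      .reset_index()
)
print(result)
```

The reindex step is what creates the missing rows; ffill only fills them in, so the order of the two calls matters.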

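Or another way using Series.repeat, which gives the expanded col2 values (each row is repeated by the gap to the next col1 value, and the last row once):

df.col2.repeat(df.col1.diff().shift(-1).fillna(1).astype(int)).reset_index(drop=True)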
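Since repeat only expands col2, a full frame still needs col1 rebuilt alongside it. A rough sketch of how that could look, assuming col1 is sorted, unique and integer-valued; the fill value of 1 for the last row and the cast to int are my assumptions, and repeats, col1 and col2 are illustrative names:

```python
import numpy as np
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({"col1": [1, 3, 6, 10], "col2": ["A", "B", "A", "C"]})

# Gap to the next col1 value = how many times each row should appear.
# The last row has no successor, so it is kept exactly once (assumption).
repeats = df["col1"].diff().shift(-1).fillna(1).astype(int)

# Expand col2 and rebuild col1 as the full consecutive range
col2 = df["col2"].repeat(repeats).reset_index(drop=True)
col1 = np.arange(df["col1"].iloc[0], df["col1"].iloc[-1] + 1)

result = pd.DataFrame({"col1": col1, "col2": col2})
print(result)
```

This avoids building an intermediate frame full of NaN, but it relies on the repeat counts summing to the length of the rebuilt col1 range, which only holds when col1 is strictly increasing.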

Upvotes: 3

anky

Reputation: 75080

Here is one way using set_index(), reindex() and ffill():

df.set_index('col1').reindex(range(df.col1.min(),df.col1.max()+1)).ffill().reset_index()

# Alternatively, pass the fill method to reindex directly:
# df.set_index('col1').reindex(range(df.col1.min(), df.col1.max()+1), method='ffill').reset_index()

   col1 col2
0     1    A
1     2    A
2     3    B
3     4    B
4     5    B
5     6    A
6     7    A
7     8    A
8     9    A
9    10    C
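One thing worth knowing when choosing between the two spellings: they give the same result on the question's data, but they can differ if col2 already contains missing values. A small sketch with a hypothetical frame (not the asker's data) that has a gap in col2:

```python
import pandas as pd

# Hypothetical frame with a value already missing in col2
df = pd.DataFrame({"col1": [1, 3, 6], "col2": ["A", None, "C"]})
indexed = df.set_index("col1")
full_range = range(df["col1"].min(), df["col1"].max() + 1)

# reindex then ffill: fills the new rows AND the pre-existing gap at col1 == 3
chained = indexed.reindex(full_range).ffill().reset_index()

# reindex(method='ffill'): each new row copies the nearest earlier original row,
# so the original missing value at col1 == 3 is carried forward as missing
direct = indexed.reindex(full_range, method="ffill").reset_index()

print(chained)
print(direct)
```

Note also that reindex with method='ffill' needs the index to be monotonic, so it should be used on sorted col1 values.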

Upvotes: 5
