Reputation: 2189
I have a data frame like this:
df
col1 col2
1 A
3 B
6 A
10 C
I want to create a new data frame from the df above such that, wherever the col1 values are not consecutive, rows are added for the missing col1 values, each taking its col2 value from the row just above it.
The data frame I am looking for is:
df
col1 col2
1 A
2 A
3 B
4 B
5 B
6 A
7 A
8 A
9 A
10 C
I could do it with a simple for loop, but is there a more Pythonic and efficient way to do it using pandas?
Upvotes: 7
Views: 1320
Reputation: 88236
One way is using reindex with ffill:
(df.set_index('col1')
.reindex(range(df.col1.iloc[0], df.col1.iloc[-1]+1))
.ffill()
.reset_index())
col1 col2
0 1 A
1 2 A
2 3 B
3 4 B
4 5 B
5 6 A
6 7 A
7 8 A
8 9 A
9 10 C
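For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same reindex and ffill chain. It rebuilds the sample frame from the question and assumes col1 is sorted and has no duplicates; the names full_range and result are only illustrative.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({"col1": [1, 3, 6, 10], "col2": ["A", "B", "A", "C"]})

# Every integer from the first to the last col1 value
# (assumes col1 is sorted and contains no duplicates).
full_range = range(df["col1"].iloc[0], df["col1"].iloc[-1] + 1)

result = (
    df.set_index("col1")
      .reindex(full_range)  # missing col1 values become rows with NaN in col2
      .ffill()              # copy the previous col2 value down into those rows
      .reset_index()
)
print(result)
```

The reindex step is what creates the missing rows; ffill only decides how they get filled.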
Or another way, using Series.repeat (this returns only the expanded col2 values):
df.col2.repeat(df.col1.diff().shift(-1).fillna(1).astype(int)).reset_index(drop=True)
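The repeat approach works by turning the gap to the next col1 value into a repeat count for each label. Here is a rough sketch that also rebuilds col1 so the result matches the desired frame. It assumes col1 is strictly increasing, and the names counts, col2, start and expanded are made up for illustration.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({"col1": [1, 3, 6, 10], "col2": ["A", "B", "A", "C"]})

# Gap to the next col1 value = number of rows each label has to cover.
# The last row has no successor, so it covers exactly one row.
counts = df["col1"].diff().shift(-1).fillna(1).astype(int)

# Repeat each label by its count, then rebuild col1 as a consecutive range
# of the same length (assumes col1 is strictly increasing).
col2 = df["col2"].repeat(counts).reset_index(drop=True)
start = int(df["col1"].iloc[0])
expanded = pd.DataFrame({"col1": range(start, start + len(col2)), "col2": col2})
print(expanded)
```

The cast to int is there because diff() produces floats, and repeat counts should be integers.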
Upvotes: 3
Reputation: 75080
Here is one way using set_index(), reindex and ffill:
df.set_index('col1').reindex(range(df.col1.min(),df.col1.max()+1)).ffill().reset_index()
#df.set_index('col1').reindex(range(df.col1.min(),df.col1.max()+1),method='ffill')\
#.reset_index()
col1 col2
0 1 A
1 2 A
2 3 B
3 4 B
4 5 B
5 6 A
6 7 A
7 8 A
8 9 A
9 10 C
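The chained .ffill() and the commented-out method='ffill' variant give the same result on the question's data, but as far as I know they differ when col2 already contains a missing value. The sketch below uses a hypothetical variant of the data (not from the question) to show that; method='ffill' also needs a monotonic index.

```python
import pandas as pd

# Hypothetical variant of the question's data: col2 is already missing at col1 == 3.
df = pd.DataFrame({"col1": [1, 3, 6], "col2": ["A", None, "C"]})
full_range = range(df["col1"].min(), df["col1"].max() + 1)

# method='ffill' fills only the rows created by the reindex; each new row copies
# the nearest earlier original row, so the existing gap at 3 carries over to 4 and 5.
via_method = df.set_index("col1").reindex(full_range, method="ffill").reset_index()

# A separate .ffill() fills every missing value, including the one in the input.
via_chain = df.set_index("col1").reindex(full_range).ffill().reset_index()

print(via_method)
print(via_chain)
```

Which behaviour you want depends on whether missing values in the input should be treated as gaps to fill or kept as they are.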
Upvotes: 5