Reputation: 2189
I have a data frame like this,
df:
col1 col2 col3
1 cat 4
nan dog nan
3 tiger 3
2 lion 9
nan frog nan
nan elephant nan
I want to create a new data frame from this one such that, if there is a nan value in col1, the col2 value of that row is appended to the col2 value of the previous row.
For example, the desired output data frame is:
col1 col2 col3
1 catdog 4
3 tiger 3
2 lionfrogelephant 9
How can I do this using pandas?
Upvotes: 1
Views: 401
Reputation: 863801
Forward fill the missing values in col1 and col3, then group by those columns and aggregate col2 with join:
cols = ['col1','col3']
df[cols] = df[cols].ffill()
df = df.groupby(cols)['col2'].apply(''.join).reset_index()
print(df)
col1 col3 col2
0 1.0 4.0 catdog
1 2.0 9.0 lionfrogelephant
2 3.0 3.0 tiger
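Note that groupby sorts the group keys by default, so the rows above come out ordered by col1 (1, 2, 3) rather than in the order of the desired output (1, 3, 2). A minimal sketch of one way to keep the original order, assuming the sample data from the question is rebuilt by hand (the `out` name and the column reordering are illustrative additions, not part of the original answer):

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumed to match the asker's frame).
df = pd.DataFrame({
    "col1": [1, np.nan, 3, 2, np.nan, np.nan],
    "col2": ["cat", "dog", "tiger", "lion", "frog", "elephant"],
    "col3": [4, np.nan, 3, 9, np.nan, np.nan],
})

cols = ["col1", "col3"]
df[cols] = df[cols].ffill()

# sort=False keeps the groups in order of first appearance instead of sorting the keys.
out = df.groupby(cols, sort=False)["col2"].agg("".join).reset_index()

# Restore the column order used in the question.
out = out[["col1", "col2", "col3"]]
print(out)
```

The values in col1 and col3 stay floats because the columns contained nan before filling; cast them with astype(int) afterwards if integers are wanted.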
Or, if it is necessary to forward fill missing values in all columns:
df = df.ffill().groupby(['col1','col3'])['col2'].apply(''.join).reset_index()
print(df)
col1 col3 col2
0 1.0 4.0 catdog
1 2.0 9.0 lionfrogelephant
2 3.0 3.0 tiger
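One caveat with grouping on the filled values: two separate blocks that happen to share the same col1 and col3 would be merged into one group. A possible alternative, sketched here under the assumption that every non-nan col1 starts a new block, is to group on a running block counter instead. The `block` and `out` names and the modified sample data are hypothetical, chosen to show a repeated key:

```python
import numpy as np
import pandas as pd

# Hypothetical data where col1 == 1 appears in two separate blocks.
df = pd.DataFrame({
    "col1": [1, np.nan, 3, 1, np.nan],
    "col2": ["cat", "dog", "tiger", "lion", "frog"],
    "col3": [4, np.nan, 3, 4, np.nan],
})

# Each non-nan col1 starts a new block; nan rows inherit the previous block id.
block = df["col1"].notna().cumsum().rename("block")

out = (
    df.groupby(block)
      .agg(
          col1=("col1", "first"),   # first non-null value in the block
          col2=("col2", "".join),   # concatenate the strings in row order
          col3=("col3", "first"),
      )
      .reset_index(drop=True)
)
print(out)
# col2 is ["catdog", "tiger", "lionfrog"]: the two col1 == 1 blocks stay separate.
```

Because the block ids increase with row position, the result also keeps the original row order without needing sort=False.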
Upvotes: 1