Reputation: 568
I have a dataframe similar to the one seen below.
In[2]: df = pd.DataFrame({'P1': [1, 2, None, None, None, None],'P2': [None, None, 3, 4, None, None],'P3': [None, None, None, None, 5, 6]})
Out[2]:
P1 P2 P3
0 1.0 NaN NaN
1 2.0 NaN NaN
2 NaN 3.0 NaN
3 NaN 4.0 NaN
4 NaN NaN 5.0
5 NaN NaN 6.0
And I am trying to merge all of the columns into a single P column in a new dataframe (see below).
P
0 1.0
1 2.0
2 3.0
3 4.0
4 5.0
5 6.0
In my actual code, I have an arbitrary list of columns that should be merged, not necessarily P1, P2, and P3 (between 1 and 5 columns). I've tried something along the following lines:
new_series = pd.Series()
desired_columns = ['P1', 'P2', 'P3']
for col in desired_columns:
    other_series = df[col]
    new_series = new_series.align(other_series)
However, this results in a tuple of Series objects, and neither of them appears to contain the data I need. I could iterate through every row and check each column, but I feel that there is likely an easy pandas solution that I am missing.
Upvotes: 1
Views: 67
Reputation: 8816
Another solution:
If the merge does not need to be restricted to specific columns of the DataFrame, the bfill() function can back-fill the non-NaN values across columns. With axis=1 (equivalently axis='columns'), each NaN cell is filled with the next non-NaN value to its right in the same row, so the first column ends up holding the first non-NaN value of each row.
>>> df['P'] = df.bfill(axis=1).iloc[:, 0]
>>> df
P1 P2 P3 P
0 1.0 NaN NaN 1.0
1 2.0 NaN NaN 2.0
2 NaN 3.0 NaN 3.0
3 NaN 4.0 NaN 4.0
4 NaN NaN 5.0 5.0
5 NaN NaN 6.0 6.0
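For the arbitrary column list and the separate result dataframe mentioned in the question, the same idea can be sketched as follows. This is only an illustration under assumptions: desired_columns is the name used in the question, while merged and new_df are names introduced here.

```python
import pandas as pd

df = pd.DataFrame({
    'P1': [1, 2, None, None, None, None],
    'P2': [None, None, 3, 4, None, None],
    'P3': [None, None, None, None, 5, 6],
})

# Any list of column names can go here; this one mirrors the question.
desired_columns = ['P1', 'P2', 'P3']

# Back-fill along each row, then keep the first column, which now holds
# the first non-NaN value found in that row.
merged = df[desired_columns].bfill(axis=1).iloc[:, 0]

# Put the result in a new single-column dataframe named 'P'
# (new_df is a hypothetical name, not from the original post).
new_df = merged.to_frame('P')
print(new_df)
```

Selecting the columns first leaves the original df untouched and keeps any unrelated columns out of the fill.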
Upvotes: 0
Reputation: 862511
If there is only one non-None value per row, forward fill the missing values along each row and select the last column by position:
df['P'] = df[['P1', 'P2', 'P3']].ffill(axis=1).iloc[:, -1]
print (df)
P1 P2 P3 P
0 1.0 NaN NaN 1.0
1 2.0 NaN NaN 2.0
2 NaN 3.0 NaN 3.0
3 NaN 4.0 NaN 4.0
4 NaN NaN 5.0 5.0
5 NaN NaN 6.0 6.0
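The "only one value per row" condition can be checked before merging. The snippet below is a minimal sketch of such a check; the names cols, non_null_counts and one_value_per_row are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'P1': [1, 2, None, None, None, None],
    'P2': [None, None, 3, 4, None, None],
    'P3': [None, None, None, None, 5, 6],
})

cols = ['P1', 'P2', 'P3']

# Count the non-missing values in each row of the selected columns.
non_null_counts = df[cols].notna().sum(axis=1)

# True when no row holds more than one value, i.e. the merge is unambiguous.
one_value_per_row = bool((non_null_counts <= 1).all())

# Forward fill along each row and take the last column by position.
last_valid = df[cols].ffill(axis=1).iloc[:, -1]
print(one_value_per_row)
print(last_valid)
```

If a row held more than one value, ffill with the last column would keep the rightmost value, whereas bfill with the first column would keep the leftmost, so the two approaches would no longer agree.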
Upvotes: 5