abhi1610

Reputation: 743

Merge columns based on values in multiple columns pandas

I have a DataFrame as follows:

   Name  Col2        Col3
0  A     16-1-2000   NaN
1  B     13-2-2001   NaN
2  C     NaN         NaN
3  D     NaN         23-4-2014
4  X     NaN         NaN
5  Q     NaN         4-5-2009

I want to build a combined column from whichever of Col2 and Col3 holds a value, so that I get the following output:

   Name  Col2        Col3      Result
0  A     16-1-2000   NaN       16-1-2000
1  B     13-2-2001   NaN       13-2-2001
2  C     NaN         NaN       NaN
3  D     NaN         23-4-2014 23-4-2014
4  X     NaN         NaN       NaN
5  Q     NaN         4-5-2009  4-5-2009

I have tried the following:

df['Result'] = np.where(df["Col2"].isnull() & df["Col3"].isnull(), np.nan, df["Col2"] if df["Col2"].notnull() else df["Col3"])

but it did not work.

Upvotes: 1

Views: 47

Answers (1)

jezrael

Reputation: 863611

Use combine_first or fillna:

df['new'] = df["Col2"].combine_first(df["Col3"])
#alternative
#df['new'] = df["Col2"].fillna(df["Col3"])
print (df)
  Name       Col2       Col3        new
0    A  16-1-2000        NaN  16-1-2000
1    B  13-2-2001        NaN  13-2-2001
2    C        NaN        NaN        NaN
3    D        NaN  23-4-2014  23-4-2014
4    X        NaN        NaN        NaN
5    Q        NaN   4-5-2009   4-5-2009
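
To try this end to end, here is a minimal, self-contained sketch that rebuilds the sample frame from the question and checks that both approaches agree. The variable names `combined`, `filled` and `same` are only illustrative, and the dates are kept as plain strings, as they appear in the question.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; dates stay as strings.
df = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "X", "Q"],
    "Col2": ["16-1-2000", "13-2-2001", np.nan, np.nan, np.nan, np.nan],
    "Col3": [np.nan, np.nan, np.nan, "23-4-2014", np.nan, "4-5-2009"],
})

# Take Col2 where it is present, otherwise fall back to Col3.
combined = df["Col2"].combine_first(df["Col3"])

# Same idea expressed with fillna.
filled = df["Col2"].fillna(df["Col3"])

# For two Series sharing the same index, both give the same result.
same = combined.equals(filled)
print(combined)
print(same)
```

Rows where both columns are NaN simply stay NaN, so no separate check for that case is needed.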

Your solution can be fixed by nesting a second np.where:

df['new'] = np.where(df["Col2"].notnull() & df["Col3"].isnull(), df["Col2"],
            np.where(df["Col2"].isnull() & df["Col3"].notnull(), df["Col3"], np.nan))

Or numpy.select:

m1 = df["Col2"].notnull() & df["Col3"].isnull()
m2 = df["Col2"].isnull() & df["Col3"].notnull()
df['new'] = np.select([m1, m2], [df["Col2"], df["Col3"]], np.nan)
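
One difference worth knowing: with the masks as written, a row where both Col2 and Col3 hold a value matches neither condition and ends up as NaN, whereas combine_first keeps the Col2 value. The sample data has no such row, so the small frame `both` below is a hypothetical example made up purely to illustrate this.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row 0 has values in BOTH columns (not in the question's data).
both = pd.DataFrame({
    "Col2": ["1-1-2000", np.nan],
    "Col3": ["2-2-2002", "3-3-2003"],
})

m1 = both["Col2"].notnull() & both["Col3"].isnull()
m2 = both["Col2"].isnull() & both["Col3"].notnull()

# Neither mask is True for row 0, so np.select falls back to the default (NaN).
via_select = pd.Series(
    np.select([m1, m2], [both["Col2"], both["Col3"]], np.nan),
    index=both.index,
)

# combine_first prefers Col2 whenever it is present.
via_combine = both["Col2"].combine_first(both["Col3"])

print(via_select)
print(via_combine)
```

If Col2 should win when both are filled, using `both["Col2"].notnull()` alone as the first mask would be one way to get the same behaviour as combine_first.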

For a general solution, select all columns except the first with iloc, forward fill NaNs along each row, and then take the last column:

df['new'] = df.iloc[:, 1:].ffill(axis=1).iloc[:, -1]
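
Note that forward filling and taking the last column returns the last non-null value in each row. If the first non-null value is wanted instead (which is what combine_first gives), back filling and taking the first column is a possible variant. The sketch below uses a hypothetical frame in which row A has values in both columns, just to make the difference visible.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row A has values in both columns to show the difference.
df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Col2": ["16-1-2000", np.nan, np.nan],
    "Col3": ["1-1-1999", "23-4-2014", np.nan],
})

values = df.iloc[:, 1:]  # every column except Name

# Forward fill along each row, then take the last column: last non-null value.
last_non_null = values.ffill(axis=1).iloc[:, -1]

# Back fill along each row, then take the first column: first non-null value.
first_non_null = values.bfill(axis=1).iloc[:, 0]

print(last_non_null)
print(first_non_null)
```

With the data from the question, where at most one of the two columns is filled per row, both variants give the same result.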

Upvotes: 1
