Concatenate two columns

Question

I have two text columns A and B. I want to take the first non empty string or if both A and B has values take the values from A. C is the column im trying to create:

import pandas as pd

cols = ['A','B']
data = [['data','data'],
        ['','data'],
        ['',''],
        ['data1','data2']]

df = pd.DataFrame.from_records(data=data, columns=cols)

      A      B
0  data   data
1         data
2             
3  data1  data2

My attempt:

df['C'] = df[cols].apply(lambda row: sorted([val if val else '' for val in row], reverse=True)[0], axis=1) #Reverse sort to avoid picking an empty string

      A      B      C
0  data   data      data
1         data      data
2             
3  data1  data2     data2 #I want data1 here

Expected output:

      A      B      C
0  data   data      data
1         data      data
2             
3  data1  data2     data1

I think I want the pandas equivalent of SQL coalesce.

Mayank Porwal · Accepted Answer

You can also use numpy.where:

In [1022]: import numpy as np
In [1023]: df['C'] = np.where(df['A'].eq(''), df['B'], df['A'])

In [1024]: df
Out[1024]: 
       A      B      C
0   data   data   data
1          data   data
2                     
3  data1  data2  data1

Concatenate two columns

Answers (2)

Related Questions