Reputation: 1255

Python combines multiple column in one

I have a pandas dataframe as below:

data = {'A'      :[1,2,3], 
        'B':[2,17,17], 
        'C1'     :["C1",np.nan,np.nan],
       'C2'     :[np.nan,"C2",np.nan]} 

# Create DataFrame 
df = pd.DataFrame(data) 
df


    A   B   C1  C2
0   1   2   C1  NaN
1   2   17  NaN C2
2   3   17  NaN NaN

I want to create a variable "C" based on "C1" and"C2"(there could be "C4", "C5". If any of C's has the value "C"= value from C's(C1, C2, C3....). My output in this case should look like below:

    A   B   C1  C2   C
0   1   2   C1  NaN  C1
1   2   17  NaN C2   C2
2   3   17  NaN NaN  NaN

Upvotes: 1

Answers (3)

BENY

Reputation: 323276

IIUC

df['C']=df.filter(like='C').bfill(axis=1).iloc[:,0]
df
   A   B   C1   C2    C
0  1   2   C1  NaN   C1
1  2  17  NaN   C2   C2
2  3  17  NaN  NaN  NaN

Upvotes: 1

Umar.H

Reputation: 23099

IIUC,

we can filter your columns by the word C then aggregate the values with an agg call:

df['C'] = df.filter(regex='C\d+').stack().groupby(level=0).agg(','.join)

print(df)

   A   B   C1   C2    C
0  1   2   C1  NaN   C1
1  2  17  NaN   C2   C2
2  3  17  NaN  NaN  NaN

Upvotes: 0

Andy L.

Reputation: 25239

Try this

df1 = df.filter(regex='^C\d+')
df['C'] = df1[df1.isin(df1.columns)].bfill(1).iloc[:,0]

Out[117]:
   A   B   C1   C2    C
0  1   2   C1  NaN   C1
1  2  17  NaN   C2   C2
2  3  17  NaN  NaN  NaN

If you want to strictly compare values matching to its own column name, Use eq instead of isin as follows

df['C'] = df1[df1.eq(df1.columns, axis=1)].bfill(1).iloc[:,0]

Upvotes: 3

Python combines multiple column in one

Answers (3)

Related Questions