Keenan Burke-Pitts
Keenan Burke-Pitts

Reputation: 475

Transform Multiple Columns Into One With Pandas

I have a dataframe structured as such: dataframe snapshot

I am wondering what the most efficient way is in pandas to create a new column "stage" that extracts any value that isn't 'None' in the four columns and use that value for the 'stage' column. The remaining four columns can then be dropped after the stage column has extracted out any value that isn't None in each row.

Here is another snapshot of the unique values of each column involved: unique values

Please note that the values in the columns in question are string type and None isn't actually Nonetype.

Upvotes: 0

Views: 3952

Answers (2)

Parfait
Parfait

Reputation: 107767

Consider combine_first, assuming None is not a string literal 'None'.

df['stage'] = df['doggo'].combine_first(df['floorfer'])\
                         .combine_first(df['pupper'])\ 
                         .combine_first(df['puppo'])

Alternatively, for DRY-er approach, use reduce:

from functools import reduce
...

df['stage'] = reduce(lambda x,y: x.combine_first(y), 
                     [df['doggo'], df['floorfer'], df['pupper'], df['puppo']])

Upvotes: 5

BENY
BENY

Reputation: 323396

df['New']=df[['A','B','C']].replace('None','').sum(1)
df
Out[1477]: 
      A     B     C New
0  None     B  None   B
1     A  None  None   A
2  None  None     C   C

Data input

df=pd.DataFrame({'A':['None','A','None'],'B':['B','None','None'],'C':['None','None','C']})

Upvotes: 2

Related Questions