Assasins creed
Assasins creed

Reputation: 1

How to combine multiple columns to single column

This is the sample data provided and I need to combine the columns to single column which consists of all the fruit names without any Nan Values and also total column should'nt be changed.

|Fruit1  | Fruit2| Fruit3| Total|
|:------ | :----:| :----:|-----:|
|NaN     | Apple | NaN.  | 20.  |
|Pear    | NaN   | NaN.  | 40.  |
|NaN.    | NaN   | orange| 50.  |
|Mango   | NaN   | NaN.  | 43.  |
|NaN     | banana| NaN.  | 35.  |

This should be the output:

|Fruits  | Total|
|------- | -----|
|Apple   | 20.  |
|Pear    | 40.  |
|Orange  | 50.  |
|Mango   | 43.  |
|banana  | 35.  |

Upvotes: 0

Views: 750

Answers (5)

mozway
mozway

Reputation: 262634

Assuming you have only one non-NaN per row, you can stack:

df.stack().droplevel(1).to_frame(name='Fruits')

Output:

   Fruits
0   Apple
1    Pear
2  Orange
3   Mango
4  banana

Handling rows with only NaNs:

df.stack().droplevel(1).to_frame(name='Fruits').reindex(df.index)

Output assuming banana is a NaN:

   Fruits
0   Apple
1    Pear
2  Orange
3   Mango
4     NaN

Upvotes: 1

Michael Dorner
Michael Dorner

Reputation: 20205

I would use bfill():

df = pd.DataFrame({
    'fruit_1': [None, 'Pear', None, None], 
    'fruit_2': ['Apple', None, None, None], 
    'fruit_3': [None, None, 'Orange', None]})

df.bfill(axis=1).iloc[:,0].rename('fruits') # returns
0     Apple
1      Pear
2    Orange
3      None
Name: fruits, dtype: object

(or ffill() and use the last column)

It also works for rows containing None only.

Upvotes: 2

DarrylG
DarrylG

Reputation: 17176

We can use a modification of an approach from Transform Multiple Columns Into One With Pandas to combine columns:

df['new'] = df.fillna('').sum(1) 

Explanation

  • replace all nan values with an empty string
  • sum(1), is summing the df row by row. Since values in row are strings, it will join them together

Example Usage

from io import StringIO

# Create DataFrame from OP data
s = '''Fruit1,Fruit2,Fruit3
NaN,Apple,NaN
Pear,NaN,NaN
NaN,NaN,Orange
Mango,NaN,NaN
NaN,banana,NaN'''

df = pd.read_csv(StringIO(s))
print(df)

Initial DataFrame

    Fruit1  Fruit2  Fruit3
0   NaN     Apple   NaN
1   Pear    NaN     NaN
2   NaN     NaN     Orange
3   Mango   NaN     NaN
4   NaN     banana  NaN

df['New']=df.fillna('').sum(1)
print(df)

Updated DataFrame

    Fruit1  Fruit2  Fruit3  new
0   NaN     Apple   NaN     Apple
1   Pear    NaN     NaN     Pear
2   NaN     NaN     Orange  Orange
3   Mango   NaN     NaN     Mango
4   NaN     banana  NaN     banana

Upvotes: 0

Tim Biegeleisen
Tim Biegeleisen

Reputation: 522817

We can use combine_first here:

df["Fruits"] = df["Fruit1"].combine_first(df["Fruit2"])

We can also use np.where:

df["Fruits"] = np.where(df["Fruit1"].isnull(), df["Fruit2"], df["Fruit1"])

Upvotes: 0

new2cod3
new2cod3

Reputation: 31

I think this should give the desired output -

df['Fruit1'].fillna(df['Fruit2'])

Upvotes: 0

Related Questions