user12278995
user12278995

Reputation: 5

combining multiple columns in a single line with headers as variables

Essentially my issue is thus I have 5 columns which must be combined into a 6th column that lines up all the data (x's in this case) what I would like to do is store the Header as some kind of variable that can be read into something similar to .combine_first. the issue I'm having is that .combine first, as well as a few other series I've attempted to use only allow me to combine things one at a time instead of reading in all the column headers i want combined. the current solution is to combine each element separately and drop all the extra columns. Originally my issues was with getting the Nan to fill now I just need a quicker way to do what I'm doing.

What I need

Var = Name1, Name2, Name3, Name4, Name5 

    something that combines all the elements

 My New Column

Current Solution (Works but too lengthy to be useable)

df['c'] = df['Name1'].combine_first(df['Name2'])
df['c1'] = df['c'].combine_first(df['Name3'])
df['c2'] = df['c1'].combine_first(df['Name4'])
df['Names'] = df['c2'].combine_first(df['Name5'])
df = df.drop(['Name1','Name2','Name3','Name4','Name5','c','c1','c2'], axis=1)

Issue

       Name1    Name2   Name3   Name4   Name5

          X      NaN     NaN    NaN      NaN
          X      NaN     NaN    NaN      NaN
          X      NaN     NaN    NaN      NaN
        NaN       X      NaN    NaN      NaN
        NaN       X      NaN    NaN      NaN
        NaN       X      NaN    NaN      NaN
        NaN      NaN      X     NaN      NaN
        NaN      NaN      X     NaN      NaN
        NaN .    NaN      X     NaN      NaN
        NaN      NaN     NaN    X        NaN
        NaN      NaN     NaN    X        NaN
        NaN      NaN     NaN    X        NaN
        NaN      NaN     NaN    NaN       X
        NaN      NaN     NaN    NaN       X
        NaN      NaN     NaN    NaN       X

Desired Result

Names
0   X
1   X
2   X
3   X
4   X
5   X
6   X
7   X
8   X
9   X
10  X
11  X
12  X
13  X
14  X

Upvotes: 0

Views: 374

Answers (2)

SpaceTristan
SpaceTristan

Reputation: 36

You could try something like the following

df['Names'] = df.apply(lambda x: ''.join(x.dropna().astype(str)),axis=1)

If you ever have multiple items you can add a delimiter in the quotes before the join.

If you are trying to create a new dataframe in one line you can do so like this:

df2= pd.DataFrame(df.apply(lambda x: ''.join(x.dropna().astype(str)),axis=1), columns=['Names'])

Hope this helps! Let me know if you have any questions.

Edit to map columns (I feel like there is a more elegant way to do this but this works):

dfRep = df.loc[:, 'a':'c'].replace('x', pd.Series(df.columns, df.columns))
dfColMap = pd.DataFrame(dfRep.apply(lambda x: ''.join(x.dropna().astype(str)),axis=1), columns=['Map'])
df2['Map'] = dfColMap['Map']

Upvotes: 1

joe-cormier
joe-cormier

Reputation: 53

If your column values are strings then you can do:

df['Names'] = df.Name1 + df.Name2 + df.Name3 + df.Name4 + df.Name5

You can add spaces or underscores between the names with:

df.Name1 + ' ' + df.Name2 + '_' 

Upvotes: 0

Related Questions