Reputation: 5
Essentially my issue is thus I have 5 columns which must be combined into a 6th column that lines up all the data (x's in this case) what I would like to do is store the Header as some kind of variable that can be read into something similar to .combine_first. the issue I'm having is that .combine first, as well as a few other series I've attempted to use only allow me to combine things one at a time instead of reading in all the column headers i want combined. the current solution is to combine each element separately and drop all the extra columns. Originally my issues was with getting the Nan to fill now I just need a quicker way to do what I'm doing.
What I need
Var = Name1, Name2, Name3, Name4, Name5
something that combines all the elements
My New Column
Current Solution (Works but too lengthy to be useable)
df['c'] = df['Name1'].combine_first(df['Name2'])
df['c1'] = df['c'].combine_first(df['Name3'])
df['c2'] = df['c1'].combine_first(df['Name4'])
df['Names'] = df['c2'].combine_first(df['Name5'])
df = df.drop(['Name1','Name2','Name3','Name4','Name5','c','c1','c2'], axis=1)
Issue
Name1 Name2 Name3 Name4 Name5
X NaN NaN NaN NaN
X NaN NaN NaN NaN
X NaN NaN NaN NaN
NaN X NaN NaN NaN
NaN X NaN NaN NaN
NaN X NaN NaN NaN
NaN NaN X NaN NaN
NaN NaN X NaN NaN
NaN . NaN X NaN NaN
NaN NaN NaN X NaN
NaN NaN NaN X NaN
NaN NaN NaN X NaN
NaN NaN NaN NaN X
NaN NaN NaN NaN X
NaN NaN NaN NaN X
Desired Result
Names
0 X
1 X
2 X
3 X
4 X
5 X
6 X
7 X
8 X
9 X
10 X
11 X
12 X
13 X
14 X
Upvotes: 0
Views: 374
Reputation: 36
You could try something like the following
df['Names'] = df.apply(lambda x: ''.join(x.dropna().astype(str)),axis=1)
If you ever have multiple items you can add a delimiter in the quotes before the join.
If you are trying to create a new dataframe in one line you can do so like this:
df2= pd.DataFrame(df.apply(lambda x: ''.join(x.dropna().astype(str)),axis=1), columns=['Names'])
Hope this helps! Let me know if you have any questions.
Edit to map columns (I feel like there is a more elegant way to do this but this works):
dfRep = df.loc[:, 'a':'c'].replace('x', pd.Series(df.columns, df.columns))
dfColMap = pd.DataFrame(dfRep.apply(lambda x: ''.join(x.dropna().astype(str)),axis=1), columns=['Map'])
df2['Map'] = dfColMap['Map']
Upvotes: 1
Reputation: 53
If your column values are strings then you can do:
df['Names'] = df.Name1 + df.Name2 + df.Name3 + df.Name4 + df.Name5
You can add spaces or underscores between the names with:
df.Name1 + ' ' + df.Name2 + '_'
Upvotes: 0