Reputation: 3375
I have the following dataframe with firstname and lastname columns. I want to create a fullname column.
import numpy as np
import pandas as pd
df1 = pd.DataFrame({'firstname': ['jack', 'john', 'donald'],
                    'lastname': [np.nan, 'obrien', 'trump']})
print(df1)
  firstname lastname
0      jack      NaN
1      john   obrien
2    donald    trump
This works if there are no NaN values:
df1['fullname'] = df1['firstname']+df1['lastname']
However, since there are NaNs in my dataframe, I decided to cast to string first. But that causes a problem in the fullname column:
df1['fullname'] = str(df1['firstname'])+str(df1['lastname'])
firstname lastname fullname
0 jack NaN 0 jack\n1 john\n2 donald\nName: f...
1 john obrien 0 jack\n1 john\n2 donald\nName: f...
2 donald trump 0 jack\n1 john\n2 donald\nName: f...
I could write a function that checks for NaNs and inserts the data into the new column, but before I do that, is there another, faster way to combine these strings into one column?
Upvotes: 1
Views: 142
Reputation: 8631
You need to handle the NaNs using .fillna(). Here, you can fill them with ''.
df1['fullname'] = df1['firstname'] + ' ' + df1['lastname'].fillna('')
Output:
  firstname lastname      fullname
0      jack      NaN          jack
1      john   obrien   john obrien
2    donald    trump  donald trump
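One detail the output above hides: when the last name is missing, the separator is still appended, so the first row is actually 'jack ' with a trailing space. A minimal sketch of one way to tidy that up, assuming stripping surrounding whitespace is acceptable for your data:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    "firstname": ["jack", "john", "donald"],
    "lastname": [np.nan, "obrien", "trump"],
})

# Fill missing last names with an empty string, concatenate with a space,
# then strip the separator left dangling when the last name was missing.
df1["fullname"] = (
    df1["firstname"] + " " + df1["lastname"].fillna("")
).str.strip()

print(df1)
```

Note that .str.strip() also removes any leading or trailing whitespace that was already present in the names themselves.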
Upvotes: 3
Reputation: 59264
You may also use .add and specify a fill_value:
df1.firstname.add(" ").add(df1.lastname, fill_value="")
PS: Chaining too many .add calls or + operators is not recommended for strings, but for one or two columns you should be fine.
Upvotes: 1
Reputation: 323226
What I would do (for the case where more than two columns need to be joined):
df1.stack().groupby(level=0).agg(' '.join)
Out[57]:
0            jack
1     john obrien
2    donald trump
dtype: object
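The stack approach works because the missing values end up excluded before the join, as the output above shows. If you prefer to make that step explicit, here is a rough row-wise sketch of the same idea. The middlename column is a hypothetical addition, included only to show more than two columns, and apply with axis=1 is likely slower than the vectorised options on large frames:

```python
import numpy as np
import pandas as pd

# "middlename" is a made-up column used only for illustration.
df = pd.DataFrame({
    "firstname": ["jack", "john", "donald"],
    "middlename": [np.nan, "patrick", np.nan],
    "lastname": [np.nan, "obrien", "trump"],
})

cols = ["firstname", "middlename", "lastname"]

# For each row, drop the missing entries and join what remains with a space.
df["fullname"] = df[cols].apply(
    lambda row: " ".join(row.dropna().astype(str)), axis=1
)

print(df["fullname"])
```

Because missing entries are dropped before joining, no stray separators are left behind.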
Upvotes: 0
Reputation: 7045
There is also Series.str.cat, which can handle NaN and includes the separator.
df1["fullname"] = df1["firstname"].str.cat(df1["lastname"], sep=" ", na_rep="")
  firstname lastname      fullname
0      jack      NaN          jack
1      john   obrien   john obrien
2    donald    trump  donald trump
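To see why na_rep matters here, a small comparison may help. This is only a sketch using the same frame as the question: without na_rep, a row with a missing value on either side comes back as NaN, and with na_rep="" the separator is still inserted, so the first row ends with a space unless you strip it.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    "firstname": ["jack", "john", "donald"],
    "lastname": [np.nan, "obrien", "trump"],
})

# Default na_rep=None: a missing value in either operand makes the result NaN.
without_rep = df1["firstname"].str.cat(df1["lastname"], sep=" ")

# na_rep="" treats the missing last name as an empty string,
# but the separator is still added, giving "jack " for the first row.
with_rep = df1["firstname"].str.cat(df1["lastname"], sep=" ", na_rep="")

# Optional clean-up of the trailing separator.
df1["fullname"] = with_rep.str.strip()

print(without_rep)
print(df1)
```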
Upvotes: 0