pandas concat two column into a new one

Question

I have a csv file with the following column:

timestamp.  message.   name.     DestinationUsername.    sourceUsername
13.05.       hello.     hello.      name1. 
13.05.       hello.     hello.      name2.                  43565

what I would like to achieve is to merge together DestinationUsername and SourceUsername into a new column called ID

What I have done so far is the following

f=pd.read_csv('file.csv')
f['userID'] = f.destinationUserName + f.sourceUserName
keep_col = ['@timestamp', 'message', 'name', 'destinationUserName', 'sourceUserName', 'userID']

new_f = f[keep_col]
new_f.to_csv("newFile.csv", index=False)

But this does not work as expected, because in the output I can see if one of the column destinationUserName or sourceUsername is empty, than the userID is empty, the userId get populated only id both destinationUserName and sourceUserName are populated already.

Can anyone help me to understand how I can go over this problem please?

And please if you need more infos just ask me

Anurag Dabas · Accepted Answer

you can typecast the column to string and then remove 'nan' by replace() method:

df['ID']=(df['DestinationUsername'].astype(str)
          +
          df['sourceUsername'].astype(str).replace('nan','',regex=True))

OR

df['ID']=df[['DestinationUsername','sourceUsername']].astype(str)
                                                     .agg(''.join,1)
                                                     .replace('nan','',regex=True)

Note: you can also use apply() in place of agg() method

output of df['ID']:

0           name1.
1    name2.43565.0
dtype: object

pandas concat two column into a new one

Answers (2)

This would help

Related Questions