Nayden Van
Nayden Van

Reputation: 1569

pandas concat two column into a new one

I have a csv file with the following column:

timestamp.  message.   name.     DestinationUsername.    sourceUsername
13.05.       hello.     hello.      name1. 
13.05.       hello.     hello.      name2.                  43565                            

what I would like to achieve is to merge together DestinationUsername and SourceUsername into a new column called ID

What I have done so far is the following

f=pd.read_csv('file.csv')
f['userID'] = f.destinationUserName + f.sourceUserName
keep_col = ['@timestamp', 'message', 'name', 'destinationUserName', 'sourceUserName', 'userID']

new_f = f[keep_col]
new_f.to_csv("newFile.csv", index=False)

But this does not work as expected, because in the output I can see if one of the column destinationUserName or sourceUsername is empty, than the userID is empty, the userId get populated only id both destinationUserName and sourceUserName are populated already.

Can anyone help me to understand how I can go over this problem please?

And please if you need more infos just ask me

Upvotes: 1

Views: 95

Answers (2)

varisha15
varisha15

Reputation: 58

This would help

combine=lambda x: '{}{}'.format(x.DestinationUsername,x.sourceUsername)
df['ID'] = df.apply(combine, axis = 1)
df.drop(['DestinationUsername','sourceUsername'],axis=1)

Upvotes: 0

Anurag Dabas
Anurag Dabas

Reputation: 24314

you can typecast the column to string and then remove 'nan' by replace() method:

df['ID']=(df['DestinationUsername'].astype(str)
          +
          df['sourceUsername'].astype(str).replace('nan','',regex=True))
 

OR

df['ID']=df[['DestinationUsername','sourceUsername']].astype(str)
                                                     .agg(''.join,1)
                                                     .replace('nan','',regex=True)

Note: you can also use apply() in place of agg() method

output of df['ID']:

0           name1.
1    name2.43565.0
dtype: object

Upvotes: 1

Related Questions