Reputation: 75
Input dataset:
Var1 Var2 Var3 Var4
101 XXX yyyy 12/10/2014
101 XYZ YTRT 13/10/2014
102 TTY UUUU 9/9/2014
102 YTY IUYY 10/10/2014
Expected dataset:
Var1 Var2 Var3 Var4 New_Variable
101 XXX yyyy 12/10/2014 XXX, yyyy
101 XYZ YTRT 13/10/2014 XYZ, YTRT
102 TTY UUUU 9/9/2014 TTY, UUUU
102 YTY IUYY 10/10/2014 YTY, IUYY
How can I concatenate two or more string variables and create a new variable capturing the concatenated values in the same dataset?
Upvotes: 2
Views: 3766
Reputation: 176938
You can use the str.cat method, which is available on string columns.
Here's an example:
>>> import pandas as pd
>>> df = pd.DataFrame({'a':['x','y','z'], 'b': ['x','y','z'], 'c': ['x','y','z']})
>>> df
a b c
0 x x x
1 y y y
2 z z z
Now you can create a new column by calling str.cat on one of the columns. Pass the other columns you'd like to concatenate with the others argument, and the delimiter with the sep argument:
>>> df["new"] = df.a.str.cat(others=[df.b, df.c], sep=', ')
>>> df
a b c new
0 x x x x, x, x
1 y y y y, y, y
2 z z z z, z, z
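Applied to the data in the question, the same approach might look like the sketch below. The DataFrame is rebuilt by hand from the sample rows, and it assumes Var2 and Var3 are already string columns and that Var4 is kept as plain text rather than parsed as dates:

```python
import pandas as pd

# Sample rows copied from the question; Var4 is left as a string here.
df = pd.DataFrame({
    "Var1": [101, 101, 102, 102],
    "Var2": ["XXX", "XYZ", "TTY", "YTY"],
    "Var3": ["yyyy", "YTRT", "UUUU", "IUYY"],
    "Var4": ["12/10/2014", "13/10/2014", "9/9/2014", "10/10/2014"],
})

# Join Var2 and Var3 row by row, separated by a comma and a space.
df["New_Variable"] = df["Var2"].str.cat(df["Var3"], sep=", ")

print(df)
```

If either column can contain missing values, the result for that row is NaN unless you also pass na_rep (for example na_rep='') to str.cat.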
Upvotes: 2