user3738411

Reputation: 75

Concatenate two or more variables using Pandas to create a new variable

Input dataset:

Var1  Var2  Var3  Var4
101   XXX   yyyy  12/10/2014
101   XYZ   YTRT  13/10/2014
102   TTY   UUUU  9/9/2014
102   YTY   IUYY  10/10/2014

Expected dataset:

Var1  Var2  Var3  Var4        New_Variable
101   XXX   yyyy  12/10/2014  XXX, yyyy
101   XYZ   YTRT  13/10/2014  XYZ, YTRT
102   TTY   UUUU  9/9/2014    TTY, UUUU
102   YTY   IUYY  10/10/2014  YTY, IUYY

How can I concatenate two or more string variables and create a new variable capturing the concatenated values in the same dataset?

Upvotes: 2

Views: 3766

Answers (1)

Alex Riley

Reputation: 176938

You can use the str.cat method, which is available on string columns through the .str accessor.

Here's an example:

>>> import pandas as pd
>>> df = pd.DataFrame({'a': ['x','y','z'], 'b': ['x','y','z'], 'c': ['x','y','z']})
>>> df
   a  b  c
0  x  x  x
1  y  y  y
2  z  z  z

Now create the new column by calling str.cat on one of the columns. Pass the other columns to concatenate through the others argument and the delimiter through the sep argument:

>>> df["new"] = df.a.str.cat(others=[df.b, df.c], sep=', ')
>>> df
   a  b  c      new
0  x  x  x  x, x, x
1  y  y  y  y, y, y
2  z  z  z  z, z, z
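
Applied to the data in the question, the same call might look like the sketch below. The DataFrame is rebuilt by hand from the sample rows, and it assumes Var2 and Var3 are already string columns; adjust the construction to however your data is actually loaded.

```python
import pandas as pd

# Sample rows copied from the question (Var4 is kept as plain text here).
df = pd.DataFrame({
    "Var1": [101, 101, 102, 102],
    "Var2": ["XXX", "XYZ", "TTY", "YTY"],
    "Var3": ["yyyy", "YTRT", "UUUU", "IUYY"],
    "Var4": ["12/10/2014", "13/10/2014", "9/9/2014", "10/10/2014"],
})

# Join Var2 and Var3 row by row, separated by ", ".
# Add more columns to the `others` list to concatenate more than two.
df["New_Variable"] = df["Var2"].str.cat(others=[df["Var3"]], sep=", ")

print(df)
```

If a column contains missing values, the result for that row is missing as well unless you pass na_rep (for example na_rep=''). Non-string columns need to be converted first, for example with astype(str).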

Upvotes: 2
