Reputation: 807
This question may be very basic, but I would like to concatenate three columns in a pandas DataFrame.
I would like to concatenate col1, col2 and col3 into col4. I know in R this could be done with the paste function quite easily.
import pandas as pd
df = pd.DataFrame({'col1': [2012, 2013, 2014], 'col2': 'q', 'col3': range(1, 4)})
Edit: Code for clarity - I would like to generate col4 automatically:
x = pd.DataFrame()
x['col1'] = [2012, 2013, 2014]
x['col2'] = ['q', 'q', 'q']
x['col3'] = [1, 2, 3]
# Desired result, written out by hand here:
x['col4'] = ['2012q1', '2013q2', '2014q3']
Upvotes: 0
Views: 4705
Reputation: 19025
To concatenate across all columns, it may be more convenient to use df.apply(..., axis=1), as in:
df['col4'] = df.apply(lambda x: "".join(x.astype(str)),axis=1)
df
# col1 col2 col3 col4
#0 2012 q 1 2012q1
#1 2013 q 2 2013q2
#2 2014 q 3 2014q3
This is especially handy if you have a lot of columns and don't want to write them all out (as required by Kyle's answer).
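If only some of the columns should be joined, the same idea can be limited to a subset. The following is a minimal sketch, assuming the three-column frame from the question; the list name cols is only illustrative. Selecting the columns first also avoids pulling an already existing col4 into the result when the line is run a second time.

```python
import pandas as pd

df = pd.DataFrame({'col1': [2012, 2013, 2014], 'col2': 'q', 'col3': range(1, 4)})

# Columns to join, in order (illustrative name, not from the original answer).
cols = ['col1', 'col2', 'col3']

# Cast the selected columns to strings once, then join each row's values.
df['col4'] = df[cols].astype(str).apply("".join, axis=1)
print(df)
```

Casting the whole selection with astype(str) before the apply keeps the per-row function down to a plain "".join.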
Upvotes: 2
Reputation: 294576
Use pd.DataFrame.sum with axis=1 after converting to strings.
I use pd.DataFrame.assign to create a copy with the new column:
df.assign(col4=df[['col1', 'col2', 'col3']].astype(str).sum(1))
col1 col2 col3 col4
0 2012 q 1 2012q1
1 2013 q 2 2013q2
2 2014 q 3 2014q3
Or you can add the column in place:
df['col4'] = df[['col1', 'col2', 'col3']].astype(str).sum(1)
df
col1 col2 col3 col4
0 2012 q 1 2012q1
1 2013 q 2 2013q2
2 2014 q 3 2014q3
If df only has the three columns, you can reduce the code to:
df.assign(col4=df.astype(str).sum(1))
If df has more than three columns but the three you want to concatenate are the first three:
df.assign(col4=df.iloc[:, :3].astype(str).sum(1))
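For just a few columns, the concatenation can also be spelled out explicitly, which is probably the closest analogue to R's paste. This is a sketch of an alternative technique rather than part of the answer above: it uses the + operator on string Series and Series.str.cat, and the column name col4_dashed and the '-' separator are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'col1': [2012, 2013, 2014], 'col2': 'q', 'col3': range(1, 4)})

# Convert the numeric columns to strings, then add the Series element-wise.
df['col4'] = df['col1'].astype(str) + df['col2'] + df['col3'].astype(str)

# Series.str.cat joins the same columns and accepts a separator
# ('-' is used here purely for illustration).
df['col4_dashed'] = df['col1'].astype(str).str.cat(
    [df['col2'], df['col3'].astype(str)], sep='-'
)
print(df)
```

The + form has to name every column, so it suits a handful of columns; for many columns the row-wise approaches are shorter to write.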
Upvotes: 4