Reputation: 13
I am loading a csv file into a data frame using pandas.
My dataframe looks something like this:
col1 col2 col3
1 4 1
2 5 2
3 6 3
I wish to append 2 of the columns into a new column:
col1 col2 col3 col4
1 4 1 1
2 5 2 2
3 6 3 3
4
5
6
col4 needs to be created by appending the contents of col1 and col2 together.
How can I do this in pandas/python?
EDIT
df = df.reset_index(drop=True)
s = df['full_name'].append(df['alt_name'], ignore_index=True).rename('combined_names')
df = df.join(s, how='outer')
df = df.reset_index(drop=True)
s = df['full_address'].append(df['alt_add'], ignore_index=True).rename('combined_address')
df = df.join(s, how='outer')
Upvotes: 1
Views: 1042
Reputation: 863531
First use Series.append
or concat
with rename
for new Series
and then add to original by DataFrame.join
or concat
:
s = df['col1'].append(df['col2'], ignore_index=True).rename('col4')
#alternative
#s = pd.concat([df['col1'], df['col2']], ignore_index=True).rename('col4')
df1 = df.join(s, how='outer')
#alternative
#df1 = pd.concat([df, s], axis=1)
print (df1)
col1 col2 col3 col4
0 1.0 4.0 1.0 1
1 2.0 5.0 2.0 2
2 3.0 6.0 3.0 3
3 NaN NaN NaN 4
4 NaN NaN NaN 5
5 NaN NaN NaN 6
Last for avoid converting to floats is possible use:
df1 = df1.astype('Int64')
print (df1)
col1 col2 col3 col4
0 1 4 1 1
1 2 5 2 2
2 3 6 3 3
3 <NA> <NA> <NA> 4
4 <NA> <NA> <NA> 5
5 <NA> <NA> <NA> 6
Or convert missing values to empty strings (what should be problem if need processing df later by some numeric method):
df1 = df1.fillna('')
print (df1)
col1 col2 col3 col4
0 1 4 1 1
1 2 5 2 2
2 3 6 3 3
3 4
4 5
5 6
EDIT:
df = df.reset_index(drop=True)
s1 = df['full_name'].append(df['alt_name'], ignore_index=True).rename('combined_names')
s2 = df['full_address'].append(df['alt_add'], ignore_index=True).rename('combined_address')
df1 = pd.concat([df, s1, s2], axis=1)
Upvotes: 1