pavster
pavster

Reputation: 13

Python: Append 2 columns of a dataframe together

I am loading a csv file into a data frame using pandas.

My dataframe looks something like this:

col1       col2       col3         
1           4           1 
2           5           2
3           6           3

I wish to append 2 of the columns into a new column:

  col1       col2        col3       col4   
    1           4           1         1
    2           5           2         2
    3           6           3         3
                                      4
                                      5 
                                      6

col4 needs to be created by appending the contents of col1 and col2 together.

How can I do this in pandas/python?

EDIT

df = df.reset_index(drop=True)

s = df['full_name'].append(df['alt_name'], ignore_index=True).rename('combined_names')
df = df.join(s, how='outer')

df = df.reset_index(drop=True)

s = df['full_address'].append(df['alt_add'], ignore_index=True).rename('combined_address')
df = df.join(s, how='outer')

Upvotes: 1

Views: 1042

Answers (1)

jezrael
jezrael

Reputation: 863531

First use Series.append or concat with rename for new Series and then add to original by DataFrame.join or concat:

s = df['col1'].append(df['col2'], ignore_index=True).rename('col4')
#alternative
#s = pd.concat([df['col1'], df['col2']], ignore_index=True).rename('col4')

df1 = df.join(s, how='outer')
#alternative
#df1 = pd.concat([df, s], axis=1)

print (df1)

   col1  col2  col3  col4
0   1.0   4.0   1.0     1
1   2.0   5.0   2.0     2
2   3.0   6.0   3.0     3
3   NaN   NaN   NaN     4
4   NaN   NaN   NaN     5
5   NaN   NaN   NaN     6

Last for avoid converting to floats is possible use:

df1 = df1.astype('Int64')
print (df1)
   col1  col2  col3  col4
0     1     4     1     1
1     2     5     2     2
2     3     6     3     3
3  <NA>  <NA>  <NA>     4
4  <NA>  <NA>  <NA>     5
5  <NA>  <NA>  <NA>     6

Or convert missing values to empty strings (what should be problem if need processing df later by some numeric method):

df1 = df1.fillna('')
print (df1)

  col1 col2 col3  col4
0    1    4    1     1
1    2    5    2     2
2    3    6    3     3
3                    4
4                    5
5                    6

EDIT:

df = df.reset_index(drop=True)

s1 = df['full_name'].append(df['alt_name'], ignore_index=True).rename('combined_names')
s2 = df['full_address'].append(df['alt_add'], ignore_index=True).rename('combined_address')

df1 = pd.concat([df, s1, s2], axis=1)

Upvotes: 1

Related Questions