sudonym

Reputation: 4018

How to concatenate all (string) values in a given pandas dataframe row to one string?

I have a pandas dataframe that looks like this:

     0        1            2        3       4
0    I        want         to       join    strings  
1    But      only         in       row     1

The desired output should look like this:

     0        1      2        3       4       5
1    But      only   in       row     1       I want to join strings

How can I concatenate the strings in row 0 into a single string and attach it to row 1 as a new column?

Upvotes: 3

Views: 2920

Answers (4)

Chris Harris

Reputation: 402

If your dataset is less than perfect and you want to exclude NaN values, you can use this:

df.apply(lambda row: ' '.join(v for v in row.astype(str) if v != "nan"), axis=1)

I found this particularly helpful in joining columns containing parts of addresses together where some parts like SubLocation (e.g. apartment #) aren't relevant for all addresses.
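For the address case, a minimal sketch might look like the following. The column names and data are hypothetical, and it swaps in `Series.dropna()` instead of comparing against the string "nan", so a cell whose text is literally "nan" would be kept.

```python
import numpy as np
import pandas as pd

# Hypothetical address data; the column names are illustrative only.
addresses = pd.DataFrame({
    "Street": ["12 High St", "7 Elm Rd"],
    "SubLocation": ["Apt 4", np.nan],
    "City": ["Leeds", "York"],
})

# Drop missing cells in each row, then join what is left with spaces.
joined = addresses.apply(
    lambda row: " ".join(row.dropna().astype(str)), axis=1
)

print(joined.tolist())
# ['12 High St Apt 4 Leeds', '7 Elm Rd York']
```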

Upvotes: 0

niraj

Reputation: 18208

Another option is to append a space to every cell and then sum across each row (note that the joined string keeps a trailing space):

df[5] = df.add(' ').sum(axis=1).shift(1)

Result:

     0     1   2     3        4                       5
0    I  want  to  join  strings                     NaN
1  But  only  in   row        1  I want to join strings 

Upvotes: 1

cs95

Reputation: 402553

Use str.cat to join the first row, and assign to the second.

i = df.iloc[1:].copy()   # copy so the assignment below does not trigger SettingWithCopyWarning
i[df.shape[1]] = df.iloc[0].str.cat(sep=' ')

i
     0     1   2    3  4                       5
1  But  only  in  row  1  I want to join strings
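As a self-contained sketch of the same idea, the frame from the question can be rebuilt and processed as below. The `astype(str)` call is an added assumption, in case some cells (such as the `1`) hold non-string values, and the variable name `rest` is illustrative.

```python
import pandas as pd

# Rebuild the example frame from the question (all cells as strings).
df = pd.DataFrame([
    ["I", "want", "to", "join", "strings"],
    ["But", "only", "in", "row", "1"],
])

# Keep every row after the first; copy so the new column is written
# to an independent frame rather than a slice of df.
rest = df.iloc[1:].copy()

# Join the first row into one string and store it in a new column
# labelled with the next integer position (5 here).
rest[df.shape[1]] = df.iloc[0].astype(str).str.cat(sep=" ")

print(rest)
```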

Upvotes: 2

BENY

Reputation: 323276

If I understand correctly, you can use apply with join:

df.apply(lambda x :' '.join(x.astype(str)),1)
Out[348]: 
0    I want to join strings
1         But only in row 1
dtype: object

Then you can assign the first joined string to the remaining rows:

df1 = df.iloc[1:].copy()   # copy to avoid SettingWithCopyWarning
df1['5'] = df.apply(lambda x: ' '.join(x.astype(str)), 1)[0]
df1
Out[361]: 
     0     1   2    3  4                       5
1  But  only  in  row  1  I want to join strings

Timing comparison of str.cat versus join:

%timeit df.apply(lambda x : x.str.cat(),1)
1 loop, best of 3: 759 ms per loop
%timeit df.apply(lambda x : ''.join(x),1)
1 loop, best of 3: 376 ms per loop


df.shape
Out[381]: (3000, 2000)
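To rerun this comparison outside IPython, a rough sketch with the standard `timeit` module could look like this. The test data is an assumption (the original benchmark data is not shown), the frame is smaller so it finishes quickly, and the numbers will vary with machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Assumed test data: random single-character strings, shape (300, 200).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.choice(list("abcde"), size=(300, 200)))


def via_cat():
    # Row-wise concatenation with Series.str.cat
    return df.apply(lambda row: row.str.cat(), axis=1)


def via_join():
    # Row-wise concatenation with plain str.join
    return df.apply(lambda row: "".join(row), axis=1)


# Both approaches should produce identical strings.
assert via_cat().equals(via_join())

print("str.cat:", timeit.timeit(via_cat, number=3))
print("join:   ", timeit.timeit(via_join, number=3))
```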

Upvotes: 4
