How to concatenate characters of certain columns of a dataframe?

Question

I have a dataframe of domain names, but every character of each domain name sits in its own cell. Below is how it looks; 'col' is just the name of the first column.

import pandas as pd

testing = pd.DataFrame({'col':['h','h'],
                        'Unnamed :1':['t','t'],
                        'Unnamed :2':['t','t'],
                        'Unnamed :3':['p','p'],
                        'Unnamed :4':['s',':']})


print (testing)
  col Unnamed :1 Unnamed :2 Unnamed :3 Unnamed :4
0   h          t          t          p          s
1   h          t          t          p          :

I want to concatenate the columns of each row, so the result should look like:

https
http:

My code: I read the Excel sheet into a dataframe and check whether the first column of each row holds a single character or a longer string. If it is a single character, I want to concatenate all the characters in that row.

testing = pd.read_excel("path to .xlsx file")  
for i in range(len(testing)):      
    if len(testing.iloc[i,0]) == 1:
        testing.iloc[i,0] = testing.astype(str).values.sum(axis=1)

But this gives:

['https' 'http:' 'http:' 'http:' 'http:']

['https' 'http:' 'http:' 'http:' 'http:']

jezrael · Accepted Answer

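A loop is not necessary here. The expression testing.astype(str).values.sum(axis=1) already returns one concatenated string per row, which is why the loop wrote the whole array into a single cell. Assign it to the first column instead, using iloc with : to select all rows: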

testing = pd.read_excel("path to .xlsx file")  
testing.iloc[:, 0] = testing.astype(str).values.sum(axis=1)
print (testing)
     col Unnamed :1 Unnamed :2 Unnamed :3 Unnamed :4
0  https          t          t          p          s
1  http:          t          t          p          :
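As a quick check of the row-wise sum, here is a minimal sketch on the sample frame from the question. The names joined and joined_agg are only illustrative, and the agg variant is an alternative spelling rather than part of the original answer:

```python
import pandas as pd

testing = pd.DataFrame({'col': ['h', 'h'],
                        'Unnamed :1': ['t', 't'],
                        'Unnamed :2': ['t', 't'],
                        'Unnamed :3': ['p', 'p'],
                        'Unnamed :4': ['s', ':']})

# Summing an object array of strings along axis=1 concatenates each row,
# giving one string per row for the whole frame at once.
joined = testing.astype(str).values.sum(axis=1)

# Alternative that stays in pandas: join the values of every row
# and get a Series back (hypothetical name, not from the answer).
joined_agg = testing.astype(str).agg(''.join, axis=1)

print(list(joined))
print(joined_agg.tolist())
```

Both forms should produce the same strings; the agg version keeps the index, which can help if the result is assigned back by label.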

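EDIT: If you need to test the length of the first column first, select it with DataFrame.iloc and test it with Series.str.len. DataFrame.where then replaces every value in the rows that fail the test with an empty string, so those rows concatenate to an empty string: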

testing = pd.DataFrame({'col':['something','h'],
                        'Unnamed :1':['t','t'],
                        'Unnamed :2':['t','t'],
                        'Unnamed :3':['p','p'],
                        'Unnamed :4':['s',':']})

mask = testing.iloc[:, 0].str.len() == 1
testing.iloc[:, 0] = testing.astype(str).where(mask, '').values.sum(axis=1)
print (testing)
     col Unnamed :1 Unnamed :2 Unnamed :3 Unnamed :4
0                 t          t          p          s
1  http:          t          t          p          :
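Note that the version above blanks the first column where it already held a longer string. If the intent is to keep those values instead (an assumption about what the question wants, not something the answer states), one possible sketch uses Series.where with the original column as the fallback. The names first and joined are illustrative:

```python
import pandas as pd

testing = pd.DataFrame({'col': ['something', 'h'],
                        'Unnamed :1': ['t', 't'],
                        'Unnamed :2': ['t', 't'],
                        'Unnamed :3': ['p', 'p'],
                        'Unnamed :4': ['s', ':']})

# First column as strings, and a mask marking single-character cells.
first = testing.iloc[:, 0].astype(str)
mask = first.str.len() == 1

# Concatenate every row, then keep the concatenation only where the mask
# is True; elsewhere fall back to the original first-column value.
joined = testing.astype(str).agg(''.join, axis=1)
testing.iloc[:, 0] = joined.where(mask, first)

print(testing.iloc[:, 0].tolist())
```

With this variant, row 0 keeps 'something' and row 1 becomes 'http:'.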
