Aditi
Aditi

Reputation: 117

Concatenate columns after matching column names

I want to concatenate set of two similar named columns of the dataset. The columns are something like below:

URO_Brand1_Target,URO_Brand1,URO_Brand2_Target,URO_Brand2,URO_Brand3_Target

These could come in the dataset in no particular order. Now based on the condition if column name "URO_Brand1" is contained in column name "URO_Brand1_Target", I have to concatenate the two columns. And this I have to do for all set of similar columns.

Something like this:

URO_Brand1_Target URO_Brand1   Concatenate(URO_Brand1, URO_Brand1_Target)
     95%      CIG0002069   CIG0002069,95%

Upvotes: 0

Views: 289

Answers (1)

Chris
Chris

Reputation: 29742

Use pandas.DataFrame.groupby. Assume you have a df:

  URO_Brand1_Target  URO_Brand1 URO_Brand2_Target  URO_Brand2
0               95%  something1               90%  something2

Iterate through groups using groupby:

for k, d in df.groupby(df.columns.str[:10], axis=1):
    tmp = d.sort_index(1)
    df['Concatenate(%s)' % ', '.join(d.columns)] = tmp.apply(','.join, 1)

Output:

  URO_Brand1_Target  URO_Brand1 URO_Brand2_Target  URO_Brand2  \
0               95%  something1               90%  something2   

  Concatenate(URO_Brand1_Target, URO_Brand1)  \
0                             something1,95%   

  Concatenate(URO_Brand2_Target, URO_Brand2)  
0                             something2,90%  

Upvotes: 1

Related Questions