Reputation: 117
I want to concatenate set of two similar named columns of the dataset. The columns are something like below:
URO_Brand1_Target,URO_Brand1,URO_Brand2_Target,URO_Brand2,URO_Brand3_Target
These could come in the dataset in no particular order. Now based on the condition if column name "URO_Brand1" is contained in column name "URO_Brand1_Target", I have to concatenate the two columns. And this I have to do for all set of similar columns.
Something like this:
URO_Brand1_Target URO_Brand1 Concatenate(URO_Brand1, URO_Brand1_Target)
95% CIG0002069 CIG0002069,95%
Upvotes: 0
Views: 289
Reputation: 29742
Use pandas.DataFrame.groupby
. Assume you have a df
:
URO_Brand1_Target URO_Brand1 URO_Brand2_Target URO_Brand2
0 95% something1 90% something2
Iterate through groups using groupby
:
for k, d in df.groupby(df.columns.str[:10], axis=1):
tmp = d.sort_index(1)
df['Concatenate(%s)' % ', '.join(d.columns)] = tmp.apply(','.join, 1)
Output:
URO_Brand1_Target URO_Brand1 URO_Brand2_Target URO_Brand2 \
0 95% something1 90% something2
Concatenate(URO_Brand1_Target, URO_Brand1) \
0 something1,95%
Concatenate(URO_Brand2_Target, URO_Brand2)
0 something2,90%
Upvotes: 1