How to merge multiple columns with the same content in the Excel output file using pandas

Question

I have a pandas DataFrame like the table below. For each SITEID in the first column, the Priority, Region and Vendor columns hold the same value, but the HISTORY column differs from row to row.

SITEID  Priority    Region  Vendor                          HISTORY
======  =========   ======  ======= =================================================================
E1149       P3        R10     NSN       09-09 : ZRBSCN8, LUE1149 : Connector Faulty : 00: 31
=====================================================================================================
E1149       P3        R10     NSN       09-08 : ZRBSCN8, LUE1149 (Fluctuation)BSS Cabling Fault: 00: 16
=====================================================================================================
E1149       P3        R10     NSN       09-07 : ZRBSCN8, LUE1149 : BSS Cabling Fault : 01: 02
=====================================================================================================
E1150       P3        R10     E//       09-09 : BABSCE3, LUE1150 & LUT7695 : Unclear : 01: 13
=====================================================================================================
E1150       P3        R10     E//       09-08 : BABSCE3, E1150 & T7695 : Unclear : 00: 18
=====================================================================================================

First, I want to merge the first four columns (SITEID, Priority, Region and Vendor) for each SITEID, and then list all the relevant HISTORY records against it, like below:

SITEID  Priority    Region  Vendor                          HISTORY
======  =========   ======  ======= =================================================================
E1149       P3        R10     NSN       09-09 : ZRBSCN8, LUE1149 : Connector Faulty : 00: 31
                                        09-08 : ZRBSCN8, LUE1149 (Fluctuation)BSS Cabling Fault:00: 16
                                        09-07 : ZRBSCN8, LUE1149 : BSS Cabling Fault : 01: 02
=====================================================================================================
E1150       P3        R10     E//       09-09 : BABSCE3, LUE1150 & LUT7695 : Unclear : 01: 13
                                        09-08 : BABSCE3, E1150 & T7695 : Unclear : 00: 18
=====================================================================================================

What is the most efficient way to do this in the Excel output file, using XlsxWriter or similar? I tried several approaches, such as swaplevel, but without success.

Shubham Sharma · Accepted Answer

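You can use a simple groupby on the four repeated columns and aggregate the remaining HISTORY column with '\n'.join, so that each site's records end up in one cell separated by newlines: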

cols = ['SITEID', 'Priority', 'Region', 'Vendor']
# Join all HISTORY strings of each group with a newline
df_merged = df.groupby(cols, as_index=False).agg('\n'.join)
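To make the grouping step concrete, here is a minimal, self-contained sketch that rebuilds the sample rows from the question and applies the same groupby. The DataFrame construction is an assumption about how the data is loaded; only the column names and values come from the question.

```python
import pandas as pd

# Sample rows copied from the question's table (assumed to be plain strings).
df = pd.DataFrame({
    "SITEID": ["E1149", "E1149", "E1149", "E1150", "E1150"],
    "Priority": ["P3", "P3", "P3", "P3", "P3"],
    "Region": ["R10", "R10", "R10", "R10", "R10"],
    "Vendor": ["NSN", "NSN", "NSN", "E//", "E//"],
    "HISTORY": [
        "09-09 : ZRBSCN8, LUE1149 : Connector Faulty : 00: 31",
        "09-08 : ZRBSCN8, LUE1149 (Fluctuation)BSS Cabling Fault: 00: 16",
        "09-07 : ZRBSCN8, LUE1149 : BSS Cabling Fault : 01: 02",
        "09-09 : BABSCE3, LUE1150 & LUT7695 : Unclear : 01: 13",
        "09-08 : BABSCE3, E1150 & T7695 : Unclear : 00: 18",
    ],
})

cols = ["SITEID", "Priority", "Region", "Vendor"]

# One row per site; the HISTORY strings of each group are joined with newlines,
# in the same order as they appear in the original frame.
df_merged = df.groupby(cols, as_index=False).agg("\n".join)

print(df_merged.shape)
print(df_merged.loc[0, "HISTORY"])
```

The result has one row per SITEID, with all of that site's history entries inside a single multi-line string.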

Then save this merged dataframe to excel as:

df_merged.to_excel('file.xlsx')
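Excel only shows the newline-separated entries on separate lines when the cell has text wrapping enabled. As a hedged sketch (not part of the original answer), the helper below writes the merged frame through the XlsxWriter engine and applies a wrap format to the HISTORY column. The function name write_wrapped, the sheet name and the column width of 70 are illustrative assumptions, and the xlsxwriter package must be installed.

```python
import pandas as pd

def write_wrapped(df_merged, path, history_col="HISTORY"):
    """Write df_merged to an .xlsx file with text wrapping on one column.

    Hypothetical helper: the name, sheet name and width are illustrative.
    """
    with pd.ExcelWriter(path, engine="xlsxwriter") as writer:
        df_merged.to_excel(writer, sheet_name="Sheet1", index=False)

        workbook = writer.book
        worksheet = writer.sheets["Sheet1"]

        # Wrap text so embedded newlines render as separate lines in the cell.
        wrap = workbook.add_format({"text_wrap": True, "valign": "top"})

        # Apply the format (and a wider column) to the history column only.
        col_idx = df_merged.columns.get_loc(history_col)
        worksheet.set_column(col_idx, col_idx, 70, wrap)

# Example call (assumes df_merged from the groupby step):
# write_wrapped(df_merged, "file.xlsx")
```

Passing index=False keeps the default integer index out of the sheet, so the first column is SITEID as in the desired layout.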
