Merge duplicate rows in a text file using python based on a key column

I have a csv file and I need to merge records of those rows based on a key column name

a.csv

Name|Acc#|ID|Age
Suresh|2345|a-b2|24
Mahesh|234|a-vf|34
Mahesh|4554|a-bg|45
Keren|344|s-bg|45
yankie|999|z-bg|34
yankie|3453|g-bgbbg|45

Expected output: Merging records based on name like values from both the rows for name Mahesh and yankie are merged

Name|Acc#|ID|Age
Suresh|2345|a-b2|24
Mahesh|[234,4555]|[a-vf,a-bg]|[34,45]
Keren|344|s-bg|45
yankie|[999,3453]|[z-bg,g-bgbbg]|[34,45]

can someone help me with this in python?

Upvotes: 0

Views: 188

Answers (1)

bigbounty
bigbounty

Reputation: 17358

import pandas as pd

df = pd.read_csv("a.csv", sep="|", dtype=str)
new_df = df.groupby('Name',as_index=False).aggregate(lambda tdf: tdf.unique().tolist() if tdf.shape[0] > 1 else tdf)
new_df.to_csv("data.csv", index=False, sep="|")

Output:

Name|Acc#|ID|Age
Keren|344|s-bg|45
Mahesh|['234', '4554']|['a-vf', 'a-bg']|['34', '45']
Suresh|2345|a-b2|24
yankie|['999', '3453']|['z-bg', 'g-bgbbg']|['34', '45']

Upvotes: 2

Related Questions