Reputation: 57
I need an application to work on CSV files where I need to split one single row into multiple rows based on multiple value in any particular column. It would be better if anyone can help me with python code but any other solution is welcomed. Please help me to solve this. Thanks
Here is the input file: [![enter image description here][1]][1]
`
And I want to transfer this file into like below: enter code here
If I open the csv file in Notepad++, I can see these as below:
`Name,ID Club,Position,Inactive,Attendance Date
Arnold Simon,xxxxx,Volleyball - Women's,Player,No,"04/07/2021 04/05/2021"
Arnold Simon,xxxxx,Volleyball - Women's,Safety Officer,No,
Mike Anderson,yyyyy,Volleyball - Women's,Player,No,"04/07/2021 04/05/2021"
Thomas Engle,AAAAA,Volleyball - Women's,Player,No,4/5/2021
Chris Haris,BBBBB,Baseball,Player,No,"07/12/2021 07/11/2021 06/29/2021 06/25/2021 06/16/2021"`
Upvotes: 1
Views: 2945
Reputation: 4674
You could do this short one:
import csv
with open('input_file.csv') as file:
new = [row[:-1] + [i] for row in csv.reader(file) for i in row[-1].split()]
with open('output_file.csv', 'w') as file:
csv.writer(file).writerows(new)
Or this equivalent longer one:
import csv
with open('input_file.csv') as f_in, open('output_file.csv', 'w') as f_out:
csv_out = csv.writer(f_out)
for row in csv.reader(f_in):
for i in row[-1].split():
csv_out.writerow(row[:-1] + [i])
Upvotes: 1
Reputation: 57
I got it done with this below 1 line code!
`import pandas as pd
df = pd.read_csv('testfile.csv')
res = df.set_index(['Name', 'ID', 'Club','Position','Inactive']).apply(lambda x: x.str.split('\n').explode()).reset_index()
res.to_csv('output.csv')`
Upvotes: 1