Reputation: 11
I'm new to Python and I'm trying to sort a CSV file and create new CSV files based on the values of column 3 (the ID column in the header row).
The CSV has the following structure:
Name;Family;ID
Paul;Smith;5
Kery;Gou;6
Jimmy;Ja;2
Jony;Luo;5
Jack;Elve;2
The result I want is 3 different files (in this case), one per ID.
So the first file, Id5.csv, should look like:
Paul Smith 5
Jony Luo 5
File Id6.csv should look like:
Kery Gou 6
And Id2.csv should look like:
Jimmy Ja 2
Jack Elve 2
I hope that was clear; any help would be appreciated.
Upvotes: 0
Views: 92
Reputation: 394459
This can be achieved easily using the pandas library:
import pandas as pd
import io
# sample data from the question
t = """Name;Family;ID
Paul;Smith;5
Kery;Gou;6
Jimmy;Ja;2
Jony;Luo;5
Jack;Elve;2"""
# load the csv
df = pd.read_csv(io.StringIO(t), sep=';')
# now get unique IDs, construct a filename and write out
for ID in df['ID'].unique():
    print('ID' + str(ID))
    #df[df['ID']==ID].to_csv('ID' + str(ID) + '.csv')
ID5
ID6
ID2
You can ignore the io part above; in your case it would just be:
df = pd.read_csv(file_path, sep=';')
So you'd just uncomment the line:
df[df['ID']==ID].to_csv('ID' + str(ID) + '.csv')
You can optionally pass the params index=False and sep='\t' if you don't want an index column and you prefer tab-separated output; see the docs.
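For reference, here is a minimal sketch of the same idea using groupby instead of filtering once per unique ID. The function name split_by_id and the out_dir parameter are made up for illustration, and the choices of sep=';' (to match the input file) and the 'Id' filename prefix (to match the names in the question) are assumptions, not requirements:

```python
import io
import os

import pandas as pd

# Sample data from the question; in practice pass a file path to read_csv.
data = """Name;Family;ID
Paul;Smith;5
Kery;Gou;6
Jimmy;Ja;2
Jony;Luo;5
Jack;Elve;2"""

df = pd.read_csv(io.StringIO(data), sep=';')


def split_by_id(frame, out_dir='.'):
    """Write one CSV per distinct ID (hypothetical helper); return the paths."""
    paths = []
    # sort=False keeps the groups in order of first appearance
    for id_value, group in frame.groupby('ID', sort=False):
        path = os.path.join(out_dir, f'Id{id_value}.csv')
        # index=False drops the row index column; sep=';' mirrors the input
        group.to_csv(path, sep=';', index=False)
        paths.append(path)
    return paths
```

Calling split_by_id(df) would write Id5.csv, Id6.csv and Id2.csv to the current directory, each starting with the Name;Family;ID header row.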
Upvotes: 2
Reputation: 402
How about this:
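with open('your.csv') as f:
    # skip the header row and split each remaining line into fields
    lines = [line.split(';') for line in f.read().splitlines()[1:]]
# build one list of rows per distinct ID (third column)
lines_grouped = [[l for l in lines if l[2] == x] for x in {l[2] for l in lines}]
for group in lines_grouped:
    # note: rows are written comma-separated and without a header
    with open('Id' + group[0][2] + '.csv', 'w') as out:
        out.write('\n'.join([','.join(line) for line in group]))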
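If you would rather let the standard library handle the parsing, a rough equivalent with the csv module might look like the sketch below. The function name split_csv_by_column and its parameters are hypothetical, and it assumes you want each output file to keep the header row and the ';' delimiter (the sample output in the question shows neither, so adjust as needed):

```python
import csv
import os
from collections import defaultdict


def split_csv_by_column(src_path, out_dir='.', column='ID', delimiter=';'):
    """Group rows of src_path by `column`; write one file per distinct value."""
    groups = defaultdict(list)
    with open(src_path, newline='') as src:
        reader = csv.DictReader(src, delimiter=delimiter)
        fieldnames = reader.fieldnames
        for row in reader:
            groups[row[column]].append(row)

    written = []
    # dicts preserve insertion order, so files follow first appearance of each ID
    for value, rows in groups.items():
        out_path = os.path.join(out_dir, f'Id{value}.csv')
        with open(out_path, 'w', newline='') as dst:
            writer = csv.DictWriter(dst, fieldnames=fieldnames,
                                    delimiter=delimiter, lineterminator='\n')
            writer.writeheader()  # assumption: keep the header in every file
            writer.writerows(rows)
        written.append(out_path)
    return written
```

This reads the input once and groups rows in a single pass, rather than rescanning the full list for every ID.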
Upvotes: 0