Reputation: 1
I have a CSV file that I cannot read properly because it is separated by semicolons instead of commas, so I cannot read it as a table.
Can I write a script to read it properly? Below is how part of the file looks when I read it.
;"sid";"aid";"sentnr";"parnr";"sentence";"Subject.party";
1;43160789;74861000;1;1;"Officieel „aanzoek"" namens
2;43160790;74861000;1;2;"Van onze parlementaire redactie NA;NA;NA;NA;NA;NA;NA
3;43160791;74861000;2;2;"Hierdoor is de opvolging van
4;43160792;74861000;3;2;"Dr. Samkalden had in ;NA;NA;NA;NA;NA;NA;NA
5;43160793;74861000;4;2;"In het kabinet-Bi
6;43160794;74861000;5;2;"_";NA;NA;NA;NA;NA;NA;NA
Upvotes: 0
Views: 105
Reputation: 87054
Use the delimiter argument to csv.reader() and set it to ';':
import csv

with open('your_file.csv') as f:
    reader = csv.reader(f, delimiter=';')
    next(reader)  # skip the header row
    for row in reader:
        print(row)
Output (under Python 2):
['1', '43160789', '74861000', '1', '1', 'Officieel \xc3\xa2\xe2\x82\xac\xc5\xbeaanzoek" namens\n2;43160790;74861000;1;2;Van onze parlementaire redactie NA', 'NA', 'NA', 'NA', 'NA', 'NA', 'NA']
['3', '43160791', '74861000', '2', '2', 'Hierdoor is de opvolging van\n4;43160792;74861000;3;2;Dr. Samkalden had in ', 'NA', 'NA', 'NA', 'NA', 'NA', 'NA', 'NA']
['5', '43160793', '74861000', '4', '2', 'In het kabinet-Bi\n6;43160794;74861000;5;2;_"', 'NA', 'NA', 'NA', 'NA', 'NA', 'NA', 'NA']
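This code splits fields on the semicolon as required. However, as pointed out by EdChum, there are other problems with the file, notably unbalanced quotes: an opening quote that is never closed makes the reader treat the line break and the following line as part of the same field, which is why two input lines end up merged into one row in the output.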
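One possible workaround for the unbalanced quotes, sketched here under assumptions rather than tested against the real file, is to pass quoting=csv.QUOTE_NONE so that the reader treats the quote character as ordinary text. The sample string below is hypothetical data shaped like the question's file. The trade-off is that a semicolon inside a properly quoted field would then also be treated as a separator.

```python
import csv
import io

# Hypothetical sample shaped like the question's data: semicolon-separated,
# with opening quotes that are never closed.
sample = (
    '1;43160789;74861000;1;1;"Officieel aanzoek namens\n'
    '2;43160790;74861000;1;2;"Van onze parlementaire redactie;NA;NA\n'
)

# QUOTE_NONE makes the reader treat '"' as a normal character, so an
# unclosed quote can no longer swallow the following line.
reader = csv.reader(io.StringIO(sample), delimiter=';', quoting=csv.QUOTE_NONE)
rows = list(reader)

for row in rows:
    print(row)
```

With this setting each input line stays its own row, and the stray quote remains at the start of the sentence field, where it can be stripped afterwards if needed.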
Upvotes: 1
Reputation: 3273
I recommend using the csv module.
import csv

with open('file.csv', 'r') as f:
    reader = csv.reader(f, delimiter=';')
    data = list(reader)  # each row becomes a list of strings
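Once the rows are in a list, the header can be split off and a column picked out by name. This is a minimal sketch on a hypothetical, well-formed sample that reuses the question's sid and aid column names; the variable names are illustrative only.

```python
import csv
import io

# Hypothetical, well-formed sample using column names from the question.
sample = ';"sid";"aid"\n1;43160789;74861000\n2;43160790;74861000\n'

reader = csv.reader(io.StringIO(sample), delimiter=';')
data = list(reader)

# The first row is the header; the remaining rows are the records.
header, records = data[0], data[1:]

# Look up the column position by name, then collect that column's values.
sid_index = header.index('sid')
sids = [record[sid_index] for record in records]
print(sids)
```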
Upvotes: 1