Carlotta

Reputation: 1

How to write a Python script to open a csv file correctly

I have a CSV file that I cannot read properly: it is separated by semicolons instead of commas, so I cannot read it as a table.

Is there a way to write a script that reads it properly? Below is part of the file as I currently see it.

;"sid";"aid";"sentnr";"parnr";"sentence";"Subject.party";                                               
1;43160789;74861000;1;1;"Officieel „aanzoek"" namens                                                  
2;43160790;74861000;1;2;"Van onze parlementaire redactie  NA;NA;NA;NA;NA;NA;NA                                      
3;43160791;74861000;2;2;"Hierdoor is de opvolging van                                                   
4;43160792;74861000;3;2;"Dr. Samkalden had in ;NA;NA;NA;NA;NA;NA;NA                                             
5;43160793;74861000;4;2;"In het kabinet-Bi                                  
6;43160794;74861000;5;2;"_";NA;NA;NA;NA;NA;NA;NA

Upvotes: 0

Views: 105

Answers (2)

mhawke

Reputation: 87054

Use the delimiter argument to csv.reader():

import csv

with open('your_file.csv') as f:
    reader = csv.reader(f, delimiter=';')
    _ = next(reader)    # skip header row
    for row in reader:
        print(row)

Output

['1', '43160789', '74861000', '1', '1', 'Officieel \xc3\xa2\xe2\x82\xac\xc5\xbeaanzoek" namens\n2;43160790;74861000;1;2;Van onze parlementaire redactie  NA', 'NA', 'NA', 'NA', 'NA', 'NA', 'NA']
['3', '43160791', '74861000', '2', '2', 'Hierdoor is de opvolging van\n4;43160792;74861000;3;2;Dr. Samkalden had in ', 'NA', 'NA', 'NA', 'NA', 'NA', 'NA', 'NA']
['5', '43160793', '74861000', '4', '2', 'In het kabinet-Bi\n6;43160794;74861000;5;2;_"', 'NA', 'NA', 'NA', 'NA', 'NA', 'NA', 'NA']

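This code splits fields on the semicolon as required. However, as pointed out by EdChum, there are other problems with the file, notably unbalanced quotes: an opening quote that is never closed makes the reader treat the following line as part of the same field, which is why pairs of lines are merged in the output above.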
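One possible workaround for the unbalanced quotes, offered as a rough sketch rather than a definitive fix, is to tell the reader to ignore quoting entirely with quoting=csv.QUOTE_NONE and strip the stray quote characters afterwards. The sample data and the helper name read_ignoring_quotes below are made up for illustration, and the approach assumes no semicolons occur inside the text column (they would be split as delimiters).

```python
import csv
import io

# Hypothetical sample modelled on the question's data: the quotes in the
# text column are opened but never closed.
sample = (
    ';"sid";"aid";"sentnr";"parnr";"sentence"\n'
    '1;43160789;74861000;1;1;"Officieel aanzoek namens\n'
    '2;43160790;74861000;1;2;"Van onze parlementaire redactie\n'
)


def read_ignoring_quotes(f):
    """Split on ';' only, treating '"' as an ordinary character."""
    reader = csv.reader(f, delimiter=';', quoting=csv.QUOTE_NONE)
    # Remove leading/trailing quote characters from each field afterwards.
    return [[field.strip('"') for field in row] for row in reader]


rows = read_ignoring_quotes(io.StringIO(sample))
for row in rows:
    print(row)
```

With quoting disabled, each physical line stays a separate row, at the cost of losing the ability to protect semicolons inside quoted text.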

Upvotes: 1

Nebril

Reputation: 3273

I recommend using the csv module.

import csv

with open('file.csv', 'r') as f:
    reader = csv.reader(f, delimiter=';')
    data = list(reader)
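If the columns need to be accessed by name rather than by position, csv.DictReader accepts the same delimiter argument. The following is a small sketch on a made-up, well-formed sample (balanced quotes, simplified column names), not the file from the question:

```python
import csv
import io

# Hypothetical, well-formed sample with balanced quotes.
sample = (
    'sid;aid;sentence\n'
    '43160789;74861000;"Officieel aanzoek"\n'
    '43160790;74861000;"Van onze; redactie"\n'
)

# The first row is used as the field names for every following row.
reader = csv.DictReader(io.StringIO(sample), delimiter=';')
records = list(reader)
for record in records:
    print(record['sid'], record['sentence'])
```

Because the quotes are balanced here, the semicolon inside the second sentence is kept as part of the field instead of being treated as a delimiter.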

Upvotes: 1
