Erik_s

Reputation: 3

Reading Data from csv-file in Python

I am trying to read data from a csv-file using Python with the following code:

import csv

with open("test.csv", 'r') as csv_data:
    csv_reader = csv.reader(csv_data, delimiter=',', quotechar='"')
    for row in csv_reader:
        print(row)      # the whole row as a list of fields
        print(row[0])   # the first field only

Here is my csv-file: https://drive.google.com/open?id=1KaKcSz_6-huVJvPffHAJIykuxu6BSgKK

As the header specifies, each row should have three columns: the first number is the row number, the text in the middle is a movie review, and the number at the end is the polarity of that review. My problem is that the delimiter is somehow not recognized, so each line of the csv-file is not split into three columns. Here is my output:

[image: code output]

If more information is needed let me know. Any help is highly appreciated.

Upvotes: 0

Views: 67

Answers (1)

Serge Ballesta

Reputation: 149145

You should fix the way the csv file is produced. Currently it contains:

row_number,text,polarity
"""0"",""Bromwell High cartoon comedy. It ran time programs school life, """"Teachers"""". My 35 years teaching profession lead believe Bromwell High's satire much closer reality """"Teachers"""". The scramble survive financially, insightful students see right pathetic teachers' pomp, pettiness whole situation, remind schools I knew students. When I saw episode student repeatedly tried burn school, I immediately recalled ......... .......... High. A classic line: INSPECTOR: I'm sack one teachers. STUDENT: Welcome Bromwell High. I expect many adults age think Bromwell High far fetched. What pity isn't!"",""1"""

The header line is fine, but the data line is awful. First, it has an additional quote as its first and last characters, so the whole line is read as a single quoted field; second, every quote inside it is doubled. You must first preprocess the file:

with open("test.csv", 'r') as fd, open("test2.csv", 'w', newline='\r\n') as out:
    for line in fd:
        if line.startswith('"'):
            # Data line: drop the outer quotes, then collapse the doubled quotes.
            line = line.strip()[1:-1].replace('""', '"')
            print(line, file=out)
        else:
            # Header line: copy it unchanged.
            out.write(line)
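
To see what that replacement does to a single line, here is a small in-memory sketch. The sample string is a shortened stand-in for the real data line (an assumption, not the actual file content), and the variable names are only illustrative.

```python
import csv

# Shortened stand-in for one data line of test.csv (assumed sample, not the real review).
raw = '"""0"",""A """"Teachers"""" review, with a comma"",""1"""'

# Parsed as-is, the outer quotes make the whole line one quoted field.
one_field = next(csv.reader([raw]))
print(len(one_field))

# Same cleanup as in the preprocessing step: strip the outer quotes,
# then collapse every doubled quote into a single one.
cleaned = raw.strip()[1:-1].replace('""', '"')
row = next(csv.reader([cleaned]))
print(row)
```

After the cleanup the line is ordinary CSV again: the quotes around "Teachers" remain doubled inside the quoted text field, which is how the csv module expects an embedded quote to be escaped.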

The test2.csv file should now be correct...
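
If you cannot change how the file is produced and would rather not write an intermediate file, a possible alternative (not the string replacement shown above) is to let the csv module undo the extra layer of quoting by parsing each data line twice. This is only a sketch: `read_double_encoded` is a hypothetical helper name, the sample data is a shortened stand-in for the real file, and it assumes every line that parses to exactly one field is a wrapped row.

```python
import csv
import io

def read_double_encoded(lines):
    """Yield rows from lines where a data line may itself be one CSV-quoted field."""
    for fields in csv.reader(lines):
        if len(fields) == 1:
            # The whole line came back as one field: parse its content again.
            yield next(csv.reader(fields))
        else:
            # Already split normally (for example the header line).
            yield fields

# Assumed sample standing in for test.csv; pass an open file object instead in practice.
sample = io.StringIO(
    'row_number,text,polarity\n'
    '"""0"",""A """"Teachers"""" review, with a comma"",""1"""\n'
)
rows = list(read_double_encoded(sample))
for row in rows:
    print(row)
```

With a real file you would open it with `newline=''`, as the csv documentation recommends, and pass the file object to the helper. A file that legitimately contains single-column rows would need a different test than `len(fields) == 1`.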

Upvotes: 1
