Reputation: 3
I am trying to read data from a csv-file using Python with the following code:
import csv

with open("test.csv", 'r') as csv_data:
    csv_reader = csv.reader(csv_data, delimiter=',', quotechar='"')
    for row in csv_reader:
        print(row)
        print(row[0])
Here is my csv-file: https://drive.google.com/open?id=1KaKcSz_6-huVJvPffHAJIykuxu6BSgKK
As specified in the header, each line should have three columns: the first number is the row number, the text in the middle is a movie review, and the number at the end is the polarity of that review. My problem is that the delimiter is somehow not recognized, so each line of the csv-file comes back as a single field instead of being split into three columns. Here is my output:
If more information is needed let me know. Any help is highly appreciated.
Upvotes: 0
Views: 67
Reputation: 149145
You should fix the way the csv file is produced. Currently it contains:
row_number,text,polarity
"""0"",""Bromwell High cartoon comedy. It ran time programs school life, """"Teachers"""". My 35 years teaching profession lead believe Bromwell High's satire much closer reality """"Teachers"""". The scramble survive financially, insightful students see right pathetic teachers' pomp, pettiness whole situation, remind schools I knew students. When I saw episode student repeatedly tried burn school, I immediately recalled ......... .......... High. A classic line: INSPECTOR: I'm sack one teachers. STUDENT: Welcome Bromwell High. I expect many adults age think Bromwell High far fetched. What pity isn't!"",""1"""
The header line is fine, but the data line is awful. First, it has an additional quote as its first and last character, and second, all the quotes inside are doubled. Because of that, the csv module reads the whole line as one quoted field. You must first preprocess the file:
with open("test.csv", 'r') as fd, open("test2.csv", 'w', newline='\r\n') as out:
    for line in fd:
        if line.startswith('"'):
            # data line: drop the outer quotes, then undo the doubling
            line = line.strip()[1:-1].replace('""', '"')
            print(line, file=out)
        else:
            # header line: copy unchanged
            _ = out.write(line)
The test2.csv file should now be correct...
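To check the effect without touching any files, here is a minimal sketch that applies the same unwrapping in memory and parses the result with csv.reader. The sample data line is a shortened, made-up stand-in for the real review, and the helper name unwrap is only for illustration.

```python
import csv
import io

# Shortened, made-up sample with the same shape as the broken file:
# a clean header, then a data line wrapped in extra quotes with all
# inner quotes doubled.
raw = (
    'row_number,text,polarity\n'
    '"""0"",""A classic, """"Teachers"""" style satire."",""1"""\n'
)


def unwrap(line):
    """Drop the outer quotes and undo the quote doubling on a data line."""
    if line.startswith('"'):
        return line.strip()[1:-1].replace('""', '"') + '\n'
    return line  # header line: leave unchanged


# Parsing the broken text: the data line comes back as a single field.
before = list(csv.reader(io.StringIO(raw)))

# Parsing after the fix: the data line splits into three fields.
fixed = ''.join(unwrap(line) for line in raw.splitlines(keepends=True))
after = list(csv.reader(io.StringIO(fixed)))

print(len(before[1]))  # number of fields in the broken data line
print(after[1])        # fields of the repaired data line
```

With this sample, the broken data line parses as one field and the repaired one as three, with the quotes around Teachers and the comma inside the review text preserved. This only holds if every data line in the real file is wrapped the same way as the one shown above.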
Upvotes: 1