anush95

Reputation: 81

Error while trying to use pandas to read a csv

import pandas
df = pandas.read_csv("trial.csv")

The above code is meant to read a simple CSV file, but I keep getting the following error:

  File "C:\Users\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 1748, in read
    data = self._reader.read(nrows)
  File "pandas\_libs\parsers.pyx", line 890, in pandas._libs.parsers.TextReader.read (pandas\_libs\parsers.c:10862)
  File "pandas\_libs\parsers.pyx", line 912, in pandas._libs.parsers.TextReader._read_low_memory (pandas\_libs\parsers.c:11138)
  File "pandas\_libs\parsers.pyx", line 989, in pandas._libs.parsers.TextReader._read_rows (pandas\_libs\parsers.c:12175)
  File "pandas\_libs\parsers.pyx", line 1117, in pandas._libs.parsers.TextReader._convert_column_data (pandas\_libs\parsers.c:14136)
  File "pandas\_libs\parsers.pyx", line 1169, in pandas._libs.parsers.TextReader._convert_tokens (pandas\_libs\parsers.c:14972)
  File "pandas\_libs\parsers.pyx", line 1273, in pandas._libs.parsers.TextReader._convert_with_dtype (pandas\_libs\parsers.c:17119)
  File "pandas\_libs\parsers.pyx", line 1289, in pandas._libs.parsers.TextReader._string_convert (pandas\_libs\parsers.c:17347)
  File "pandas\_libs\parsers.pyx", line 1524, in pandas._libs.parsers._string_box_utf8 (pandas\_libs\parsers.c:23041)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe3 in position 43: invalid continuation byte

Upvotes: 5

Views: 30625

Answers (4)

Prateek Sharma

Reputation: 1

import pandas as pd
store = pd.read_csv('Super_Store.csv', encoding='windows-1252')

We just need to tell pandas the actual encoding of this file. After some trial and error, I figured out that it was in windows-1252 encoding.

This is probably because these files were saved on a Windows computer at some point and this was the default character encoding for that computer. For details, see: HTML Windows-1252 (ANSI) Reference
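
The trial and error described above could be automated roughly as follows. This is only a sketch: the helper name `first_decodable_encoding` and the candidate list are assumptions for illustration, not part of pandas. Note also that a successful decode does not prove the encoding is the right one, since single-byte encodings accept almost any input.

```python
from pathlib import Path


def first_decodable_encoding(path, candidates=("utf-8", "windows-1252", "iso-8859-1")):
    """Return the first candidate that decodes the whole file, or None.

    Hypothetical helper: the candidate list is an assumption and should be
    adjusted to the encodings you actually expect.
    """
    raw = Path(path).read_bytes()
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            # This candidate cannot represent the bytes; try the next one.
            continue
        return name
    return None
```

The returned name can then be passed on, for example `pd.read_csv(path, encoding=first_decodable_encoding(path))`, after checking a few rows by eye to confirm that accented characters look right.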

Upvotes: 0

warkitty

Reputation: 116

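import pandas
# Caution: the second positional parameter of read_csv is sep (the delimiter),
# not a file mode, so "rb" is not interpreted as "read binary" here.
df = pandas.read_csv("trial.csv", "rb")

If none of the suggestions above worked, "rb" (read binary) might do the trick.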

Upvotes: 2

Dayantat

Reputation: 128

Hi, sorry I am so late to this. Please change your code to the following and see if that works.

import pandas
df = pandas.read_csv("trial.csv", encoding="ISO-8859-1")
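
One caveat worth checking with this approach: ISO-8859-1 assigns a character to every possible byte value, so decoding never raises, even when the file is really in another encoding. A small sketch of the difference from windows-1252, using the byte 0x80 as an illustrative sample (the variable names are made up for this example):

```python
# Byte 0x80 is the euro sign in windows-1252 but a control character in ISO-8859-1.
sample_byte = b"\x80"

as_cp1252 = sample_byte.decode("windows-1252")
as_latin1 = sample_byte.decode("iso-8859-1")

# ISO-8859-1 decodes all 256 byte values without error.
all_bytes = bytes(range(256))
decoded_all = all_bytes.decode("iso-8859-1")

print(repr(as_cp1252), repr(as_latin1), len(decoded_all))
```

So if the error goes away but symbols such as the euro sign or curly quotes come out wrong, windows-1252 may be the better guess.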

Upvotes: 8

Danny_ds

Reputation: 11406

Your parser is trying to parse UTF-8 data, but your file seems to be in another encoding (or there could just be an invalid character).

Try to instruct the parser to parse as plain ASCII, perhaps with some codepage (I don't know Python, so can't help with that).
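
To illustrate the mismatch, here is a minimal sketch using a made-up byte string (not taken from the asker's file) that contains the byte 0xe3 from the error message. In UTF-8, 0xe3 starts a three-byte sequence, so a following plain ASCII byte is rejected, while a single-byte encoding such as ISO-8859-1 reads 0xe3 as "ã".

```python
# Hypothetical sample: "João" as a single-byte encoding would store it.
raw = b"Jo\xe3o"

try:
    raw.decode("utf-8")
    utf8_ok = True
    reason = None
except UnicodeDecodeError as exc:
    utf8_ok = False
    reason = exc.reason  # short description of why decoding failed

# The same bytes decode cleanly when read as ISO-8859-1.
as_latin1 = raw.decode("iso-8859-1")

print(utf8_ok, reason, as_latin1)
```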


Looks like you need to use the encoding parameter.

Here is the list with possible encodings.

Upvotes: 1
