Reputation: 81
import pandas
df = pandas.read_csv("trial.csv")
The code above reads a simple CSV file, but I keep getting the following error:
File "C:\Users\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 1748, in read
data = self._reader.read(nrows)
File "pandas\_libs\parsers.pyx", line 890, in pandas._libs.parsers.TextReader.read (pandas\_libs\parsers.c:10862)
File "pandas\_libs\parsers.pyx", line 912, in pandas._libs.parsers.TextReader._read_low_memory (pandas\_libs\parsers.c:11138)
File "pandas\_libs\parsers.pyx", line 989, in pandas._libs.parsers.TextReader._read_rows (pandas\_libs\parsers.c:12175)
File "pandas\_libs\parsers.pyx", line 1117, in pandas._libs.parsers.TextReader._convert_column_data (pandas\_libs\parsers.c:14136)
File "pandas\_libs\parsers.pyx", line 1169, in pandas._libs.parsers.TextReader._convert_tokens (pandas\_libs\parsers.c:14972)
File "pandas\_libs\parsers.pyx", line 1273, in pandas._libs.parsers.TextReader._convert_with_dtype (pandas\_libs\parsers.c:17119)
File "pandas\_libs\parsers.pyx", line 1289, in pandas._libs.parsers.TextReader._string_convert (pandas\_libs\parsers.c:17347)
File "pandas\_libs\parsers.pyx", line 1524, in pandas._libs.parsers._string_box_utf8 (pandas\_libs\parsers.c:23041)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe3 in position 43: invalid continuation byte
Upvotes: 5
Views: 30625
Reputation: 1
import pandas as pd
store = pd.read_csv('Super_Store.csv', encoding='windows-1252')
We just need to tell pandas the actual encoding of this file. After some trial and error, I figured out that it was in windows-1252 encoding.
This is probably because these files were saved on a Windows computer at some point and this was the default character encoding for that computer.
For details, see: HTML Windows-1252 (ANSI) Reference
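The trial and error mentioned above can be sketched as a loop over candidate encodings. This is only an illustration: the function name `first_working_encoding`, the candidate list, and the sample bytes are assumptions, not part of the original answer.

```python
def first_working_encoding(raw, candidates=("utf-8", "windows-1252")):
    """Return the first candidate that decodes raw without error, else None."""
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None


# Hypothetical sample: "São" with the ã stored as the single byte 0xe3.
sample = b"S\xe3o Paulo,42\n"
detected = first_working_encoding(sample)
print(detected)
```

A clean decode only shows that the bytes are valid in that encoding, not that it is the right one, so it is worth checking that the decoded text looks sensible before passing the name to `encoding=`.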
Upvotes: 0
Reputation: 116
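import pandas
# "rb" is a file mode for open(); as the second positional argument of read_csv it would be taken as the separator.
with open("trial.csv", "rb") as f:
    df = pandas.read_csv(f, encoding="ISO-8859-1")
If none of the suggestions above worked, opening the file in binary mode ("rb") and passing the handle to read_csv might do the trick. pandas still has to decode the bytes, so an encoding is needed as well.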
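Reading the file in binary mode is mostly useful for inspecting the raw bytes. A minimal sketch, assuming a hypothetical helper `find_bad_byte` and made-up sample data, that reports where UTF-8 decoding first fails:

```python
def find_bad_byte(raw, encoding="utf-8"):
    """Return (offset, byte value) of the first undecodable byte, or None."""
    try:
        raw.decode(encoding)
    except UnicodeDecodeError as exc:
        return exc.start, raw[exc.start]
    return None


# With a real file, the bytes would come from open("trial.csv", "rb").read().
sample = b"name,city\nAna,S\xe3o Paulo\n"
result = find_bad_byte(sample)
print(result)
```

Looking at the bytes around the reported offset often makes it clear which character was intended, which in turn hints at the encoding.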
Upvotes: 2
Reputation: 128
Hi, sorry I am so late to this. Please change your code to the following and see if that works.
import pandas
df = pandas.read_csv("trial.csv", encoding="ISO-8859-1")
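One reason this tends to work is that ISO-8859-1 assigns a character to every byte value, so decoding never raises. It is not identical to windows-1252, though. The sample bytes below are made up for illustration:

```python
# Every possible byte value decodes under ISO-8859-1.
all_bytes = bytes(range(256))
decoded_all = all_bytes.decode("ISO-8859-1")

# The two encodings agree on 0xe9 but differ on 0x80.
raw = b"caf\xe9 \x80 5"
as_latin1 = raw.decode("ISO-8859-1")
as_cp1252 = raw.decode("windows-1252")
print(repr(as_latin1))
print(repr(as_cp1252))
```

If the file came from a Windows program and contains characters such as the euro sign or curly quotes, windows-1252 is likely the better guess; otherwise the two give the same result.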
Upvotes: 8
Reputation: 11406
Your parser is trying to parse utf-8 data, but your file seems to be in another encoding (or there could just be an invalid character).
Try to instruct the parser to parse as plain ascii, perhaps with some codepage (I don't know Python, so can't help with that).
Looks like you need to use the encoding parameter.
Here is the list with possible encodings.
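pandas accepts Python's standard encoding names for the encoding parameter, so a quick way to check whether a name from that list is recognised is the standard library's `codecs.lookup`. The helper name `canonical_encoding_name` is an assumption for this sketch:

```python
import codecs


def canonical_encoding_name(name):
    """Return Python's canonical name for an encoding, or None if unknown."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        return None


for candidate in ("utf-8", "windows-1252", "ISO-8859-1", "not-a-real-codec"):
    print(candidate, "->", canonical_encoding_name(candidate))
```

Any name for which this returns a value can be passed as `encoding=` to `read_csv`.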
Upvotes: 1