Reputation: 43
import csv
import pandas as pd
db = input("Enter the dataset name:")
table = db + ".csv"
df = pd.read_csv(table)
df = df.sample(frac=1).reset_index(drop=True)
with open(table, 'rb') as f:
    data = csv.reader(f)
    for row in data:
        rows = row
        break
print(rows)
I am trying to read all the columns from the csv file.
ERROR: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 15: invalid start byte
Upvotes: 4
Views: 8760
Reputation: 5785
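You need to check the encoding of your CSV file. print(f) shows which encoding Python is using to open it (the platform default when none is given, which may not match the encoding the file was saved in):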
with open('file_name.csv') as f:
    print(f)
The output will be:
<_io.TextIOWrapper name='file_name.csv' mode='r' encoding='utf8'>
Open the CSV with the encoding shown in the output above:
with open(fname, "rt", encoding="utf8") as f:
    ...
As mentioned in the comments, your encoding is cp1252, so:
with open(fname, "rt", encoding="cp1252") as f:
    ...
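If it helps to see why cp1252 works where UTF-8 fails: the byte 0x96 from the error message is an en dash in cp1252, but it is not a valid start byte in UTF-8. A minimal sketch, where the sample bytes are made up for illustration and not taken from your file:

```python
# Hypothetical sample bytes containing 0x96, the byte named in the error message.
raw = b"2010 \x96 2015"

# UTF-8 rejects 0x96 as the first byte of a character.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# cp1252 maps 0x96 to U+2013 (en dash).
decoded = raw.decode("cp1252")

print(utf8_ok)
print(decoded)
```

This usually points to a file saved from Excel or another Windows tool, though the byte alone does not prove which encoding was used.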
And for pd.read_csv:
df = pd.read_csv(table, encoding='cp1252')
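If you do not know the encoding up front, one option is to try a short list of candidates in order. The sketch below is only an illustration using the standard csv module: read_header is a hypothetical helper, the candidate list is an assumption, and the demo file is generated on the fly. Note that a successful decode does not guarantee the encoding is the right one, only that no byte was rejected.

```python
import csv
import os
import tempfile


def read_header(path, encodings=("utf-8", "cp1252")):
    """Return (header_row, encoding) for the first encoding that decodes the whole file.

    Hypothetical helper; the default candidate list is an assumption.
    """
    for enc in encodings:
        try:
            # newline="" is what the csv module expects for text-mode files.
            with open(path, "r", encoding=enc, newline="") as f:
                rows = list(csv.reader(f))
        except UnicodeDecodeError:
            continue  # this encoding rejected a byte, try the next one
        return (rows[0] if rows else []), enc
    raise ValueError("none of the candidate encodings could decode " + path)


# Demo: write a small cp1252-encoded file containing an en dash (byte 0x96).
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write("name,years\nfoo,2010 \u2013 2015\n".encode("cp1252"))

header, used = read_header(tmp.name)
os.remove(tmp.name)

print(header, used)
```

Once you know which encoding worked, you can pass the same value to pd.read_csv(table, encoding=...).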
Upvotes: 5