Juho M

Reputation: 347

UnicodeDecodeError when making pandas dataframe

I'm trying to create a pandas DataFrame from my CSV file.

Here is my code:

import requests, re, pandas, csv
from bs4 import BeautifulSoup
from io import StringIO

base_url="http://www.hltv.org/?pageid=188&statsfilter=2816&offset="
with open('cs_data1.csv', 'w', newline='') as out_file:
    for page in range(0,1200,50):
        r=requests.get(base_url+str(page))
        c=r.content

        table=BeautifulSoup(c,"html.parser")
        for row in table.find_all('div', style=re.compile(r'width:606px;height:22px;background-color')):
            buffer=StringIO(row.get_text(strip=True, separator=','))
            reader=csv.reader(buffer, skipinitialspace=True)        
            writer=csv.writer(out_file)
            writer.writerows(reader)

That code creates the CSV file and works fine. Then I try to load it into a pandas DataFrame:

df=pandas.read_csv("cs_data1.csv")
df

That's where I get the error: "UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf6 in position 22: invalid start byte".

What should I encode or decode so that the DataFrame loads?

Upvotes: 0

Views: 540

Answers (1)

zipa

Reputation: 27889

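Did you try passing an explicit encoding? Note that 'utf-8' is already the default for read_csv (it is the codec named in your error), so try a single-byte encoding such as Latin-1 instead:

df = pandas.read_csv("cs_data1.csv", encoding='latin-1')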
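
To see why the codec matters, here is a minimal sketch using only the standard library. It assumes the CSV was written with a legacy locale encoding (cp1252 is a guess, since the open() call in the question passes no encoding argument). The sample row and file names are made up for illustration.

```python
import csv
import os
import tempfile

# Hypothetical sample row; "\u00f6" is the letter o with diaeresis.
row = ["Bj\u00f6rn", "1.05"]

tmp_dir = tempfile.mkdtemp()
legacy_path = os.path.join(tmp_dir, "legacy.csv")
utf8_path = os.path.join(tmp_dir, "utf8.csv")

# Simulate open() falling back to a legacy locale encoding (assumed: cp1252).
with open(legacy_path, "w", newline="", encoding="cp1252") as f:
    csv.writer(f).writerow(row)

with open(legacy_path, "rb") as f:
    raw = f.read()

# In cp1252 and Latin-1 the letter is stored as the single byte 0xf6.
has_f6 = b"\xf6" in raw

# 0xf6 can never start a UTF-8 sequence, so decoding as UTF-8 fails.
try:
    raw.decode("utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# Decoding with the encoding the file was written in succeeds.
decoded = raw.decode("latin-1")

# Alternative fix at the source: write UTF-8 explicitly, then read it back.
with open(utf8_path, "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerow(row)

with open(utf8_path, newline="", encoding="utf-8") as f:
    round_trip = next(csv.reader(f))
```

If this matches your case, either pass the matching encoding to pandas.read_csv, or add encoding='utf-8' to the open() call in the scraper and regenerate the file so that the read_csv default works.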

Upvotes: 1
