Reputation: 53
I'm trying to automate an email service that sends each person their bus station by email.
To do so I need to pull some data from a Hebrew website, but all I get is a file full of gibberish.
I have tried encoding to UTF-8, but all I get is more gibberish.
import requests
import pandas as pd
url = 'http://yit.maya-tour.co.il/yit-pass/Drop_Report.aspx?client_code=2660&coordinator_code=2669'
html = requests.get(url).content
df_list = pd.read_html(html)
df = df_list[-1]
print(df)
df.to_csv('my data.csv')
I expected the following:
רשימת פיזורים
שם הנהג סוג הרכב הערות תאור שעה
מוניות הקניון מונית A35 פיזור-שדרות 06:30
but got:
×©× ×× ×× ×¡×× ×ר×× ... ת××ר שע×
0 ××× ××ת ××§× ××× ××× ×ת ... פ×××ר-ש×ר×ת 06:30
Upvotes: 3
Views: 340
Reputation: 2402
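A response object's .content property gives you the raw bytes, so pandas has to guess the encoding itself. Try .text instead, which requests decodes to a string using the encoding it infers from the response headers: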
html = requests.get(url).text
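If .text still comes back garbled, the server may not be declaring a charset. In that case requests falls back to ISO-8859-1 for text/* responses. Here is a minimal sketch of forcing the encoding, assuming the page is really UTF-8 (the helper names decode_response and fetch_html are made up for illustration, not part of requests):

```python
import requests


def decode_response(response: requests.Response, fallback_encoding: str = "utf-8") -> str:
    """Return the body as text, overriding the encoding when none was declared."""
    content_type = response.headers.get("Content-Type", "")
    if "charset=" not in content_type.lower():
        # Assumption: the page is UTF-8 even though the header does not say so.
        # Setting .encoding changes how .text decodes the raw bytes.
        response.encoding = fallback_encoding
    return response.text


def fetch_html(url: str) -> str:
    """Download a page and return its decoded HTML."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return decode_response(response)
```

The string returned by fetch_html(url) can then be handed to pd.read_html as before. Recent pandas versions prefer it wrapped in io.StringIO.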
More detail here: What is the difference between 'content' and 'text'
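To see where the gibberish comes from, here is a small offline sketch. It assumes the server sends UTF-8 bytes and that something downstream decodes them as Latin-1, which produces the same kind of "×"-prefixed output as in the question:

```python
# A cell from the expected output in the question.
original = "פיזור-שדרות 06:30"

# What travels over the wire (assumed here to be UTF-8 encoded bytes).
raw = original.encode("utf-8")

# Decoding with the wrong codec: every Hebrew letter becomes two Latin-1
# characters, the first of which is "×" (byte 0xD7).
garbled = raw.decode("latin-1")

# Decoding with the codec that matches the bytes recovers the text.
decoded = raw.decode("utf-8")

print(garbled)
print(decoded)
```

The bytes are identical in both cases. Only the codec used to turn them into text differs, which is why the encoding has to be right at the point where the bytes become a string.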
Upvotes: 2