Reputation: 117
I'm trying to create an API in Flask that returns the contents of any website the user specifies. It works fine for ordinary URLs, but when the URL contains accented characters I get the following error:
UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 26: ordinal not in range(128)
I'm sure this is because of the accented characters. This is the route:
@app.route('/universal/<string:type_>/<path:site>/')
This is the function that sends the request:
def get_soup(self):
    req = urllib.request.Request(self.url, headers={'User-Agent': "Mozilla/5.0 (Windows NT 6.1; Win64; x64)"})
    page = urllib.request.urlopen(req)
    soup = bs.BeautifulSoup(page.read(), self.parser)
    return soup
Here, site contains the URL of the site. Is there any way I can make this run?
Upvotes: 0
Views: 652
Reputation: 1094
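You need to percent-encode the non-ASCII characters in the URL, e.g. with urllib.parse.quote(url). Note that quote() leaves only '/' unescaped by default, so it would also turn the ':' after the scheme into %3A (giving https%3A//...), and urllib.request would no longer recognize the result as a valid URL. Pass a safe argument so the URL's own delimiters are left alone.
Change your code to
req = urllib.request.Request(urllib.parse.quote(self.url, safe=':/?&=#%'), headers={'User-Agent': "Mozilla/5.0 (Windows NT 6.1; Win64; x64)"})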
See https://www.urlencoder.io/python/ for more information and examples.
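If you want something a bit more robust than a single quote() call, one option is to split the URL and encode each part separately. This is only a sketch: encode_url is a hypothetical helper name, the safe character sets are my own assumption about what should stay unescaped, and it drops any user:password@ part of the URL.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_url(url):
    """Percent-encode non-ASCII characters in a URL, keeping its structure.

    Hypothetical helper, not part of urllib. Userinfo (user:pass@) is dropped.
    """
    parts = urlsplit(url)

    # Host names are encoded with IDNA (punycode), not percent-encoding.
    host = parts.hostname.encode("idna").decode("ascii") if parts.hostname else ""
    if parts.port:
        host = f"{host}:{parts.port}"

    # '%' is kept safe so already-encoded URLs are not encoded twice.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&+%")
    fragment = quote(parts.fragment, safe="%")

    return urlunsplit((parts.scheme, host, path, query, fragment))
```

In get_soup you would then pass encode_url(self.url) to urllib.request.Request instead of self.url. Because the scheme and host are handled separately, the ':' in https:// never goes through quote() at all.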
Upvotes: 1