Reputation: 21
I have tried this, but I get this error:
UnicodeEncodeError: 'charmap' codec can't encode characters in position 286381-286385: character maps to <undefined>
import requests
from bs4 import BeautifulSoup

def main():
    f = open("sites.text", 'w')
    page = requests.get("https://stackoverflow.com")
    soup = BeautifulSoup(page.content, "html.parser")
    f.write(str(soup))
    f.close()

if __name__ == '__main__':
    main()
Upvotes: 1
Views: 922
Reputation: 105
Try this:
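import requests
from bs4 import BeautifulSoup

# Open the file with an explicit encoding so the write does not depend on the platform default
with open("example", "w", encoding="utf8") as f:
    page = requests.get("https://www.google.com/").text  # body decoded to str
    soup = BeautifulSoup(page, "lxml")  # the "lxml" parser requires the lxml package
    f.write(str(soup))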
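The error occurs because open() without an encoding argument falls back to the platform's default encoding (on Windows this is usually a 'charmap' codec such as cp1252), which cannot represent some of the characters on the page. Passing encoding="utf8" to open() fixes that. Alternatively, encode the output yourself with soup.encode("utf-8") and write the resulting bytes to a file opened in binary mode ("wb").
response.text returns the response body decoded to a Unicode string, whereas response.content returns the raw bytes.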
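To see the two options side by side without any network access, here is a minimal sketch using only the standard library. The sample string, the helper names write_text and write_bytes, and the file names are all made up for illustration, and cp1252 is assumed to be the codec behind the 'charmap' message (it is the usual default on Western Windows installs, but yours may differ).

```python
import os
import tempfile

# Hypothetical stand-in for str(soup): the arrow and the CJK characters
# have no mapping in cp1252, so they trigger the same kind of error.
html = "<p>caf\u00e9 \u2192 \u4f60\u597d</p>"


def write_text(path, text):
    """Option 1: text mode with an explicit encoding."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


def write_bytes(path, text):
    """Option 2: encode manually and write in binary mode."""
    with open(path, "wb") as f:
        f.write(text.encode("utf-8"))


# Reproduce the failure: encoding with cp1252 (assumed default) raises.
try:
    html.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError:
    cp1252_ok = False

tmpdir = tempfile.mkdtemp()
text_path = os.path.join(tmpdir, "text_mode.html")
bytes_path = os.path.join(tmpdir, "binary_mode.html")

write_text(text_path, html)
write_bytes(bytes_path, html)

# Read the file back with the same encoding to confirm nothing was lost.
with open(text_path, encoding="utf-8") as f:
    round_trip = f.read()
```

Both helpers produce the same bytes on disk; the text-mode version is usually the simpler choice, since you keep working with str throughout.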
Upvotes: 1