Itay Klainer

Reputation: 21

How do I copy the HTML code of a website to a text file in Python?

I have tried this, but I get this error:

UnicodeEncodeError: 'charmap' codec can't encode characters in position 286381-286385: character maps to <undefined>

import requests
from bs4 import BeautifulSoup

def main():
    f = open("sites.text", 'w')
    page = requests.get("https://stackoverflow.com")
    soup = BeautifulSoup(page.content, "html.parser")
    f.write(str(soup))
    f.close()

if __name__ == '__main__':
    main()

Upvotes: 1

Views: 922

Answers (1)

mjeday

Reputation: 105

Try this:

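import requests
from bs4 import BeautifulSoup

# Open the output file as UTF-8 so every character in the page can be written.
with open("example", "w", encoding="utf8") as f:
    page = requests.get("https://www.google.com/").text
    soup = BeautifulSoup(page, "lxml")  # the "lxml" parser requires the lxml package
    f.write(str(soup))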

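The error occurs because open() without an encoding argument uses the platform's default encoding (the 'charmap' codec in the traceback is typically a Windows code page such as cp1252), which cannot represent every character in the page. Passing encoding="utf8" to open() fixes that. Alternatively, soup.encode("utf-8") returns bytes, which you would write to a file opened in binary mode ("wb"). The example also uses the text attribute of the response object.
response.text returns the content of the response decoded to a Unicode string (str), whereas response.content returns the raw bytes.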
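To make the two options concrete, here is a minimal sketch that writes the same string both ways using only the standard library. The helper names save_text and save_bytes, the file names, and the sample string are hypothetical; the string stands in for str(soup) so the example runs without a network request.

```python
from pathlib import Path
import tempfile


def save_text(html: str, path: Path) -> None:
    # Text mode with an explicit encoding, so the result does not
    # depend on the platform's default encoding.
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)


def save_bytes(html: str, path: Path) -> None:
    # Encode to bytes first, then write them in binary mode.
    with open(path, "wb") as f:
        f.write(html.encode("utf-8"))


# Hypothetical stand-in for str(soup): the escapes are two CJK characters
# and a snowman, none of which exist in cp1252.
sample_html = "<p>\u4f60\u597d \u2603</p>"

out_dir = Path(tempfile.mkdtemp())
text_path = out_dir / "text_mode.html"
bytes_path = out_dir / "binary_mode.html"

save_text(sample_html, text_path)
save_bytes(sample_html, bytes_path)
```

Both files should end up with the same UTF-8 bytes. Reading them back also needs encoding="utf-8", otherwise the platform default is used again when decoding.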

Upvotes: 1
