HTML output gets distorted after parsing it with BeautifulSoup

Question

I am trying to parse an HTML page using BeautifulSoup. After parsing, the output HTML file renders with distortions. The strange thing is that it contains exactly the same HTML (as parsed by BeautifulSoup) as the source file. This is the code snippet I am using:

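from bs4 import BeautifulSoup

output_pages = []
# Parse the source page with the lxml parser, closing the file afterwards
with open(html_page, "r") as src:
    soup = BeautifulSoup(src, "lxml")
output_pages.append(soup.prettify())

# Write every parsed page to the output file
with open(output_file, "w+") as f:
    for page in output_pages:
        f.write(page)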

I tried several variants with different arguments, but none of them worked. Am I doing something wrong here, or is there a better way to parse HTML in Python?


Answers (1)
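
One possible cause, offered as a sketch rather than a confirmed diagnosis: prettify() puts each tag and string on its own line with indentation, and that extra whitespace can change how inline content renders in a browser even though the tags themselves are unchanged. The Beautiful Soup documentation notes that prettify() is meant for inspecting structure, not for reformatting a document. Assuming that is what is happening here, serializing with str(soup) instead avoids the added whitespace. The sample markup and the visible_text helper below are hypothetical and only illustrate the difference; the built-in "html.parser" is used so the example does not depend on lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, not taken from the page in the question.
html = "<p>Hello<b>World</b>!</p>"

soup = BeautifulSoup(html, "html.parser")

pretty = soup.prettify()  # inserts newlines and indentation around tags
compact = str(soup)       # serializes the tree without adding whitespace


def visible_text(markup):
    """Hypothetical helper: extract the text and collapse whitespace runs,
    roughly as a browser does for inline content."""
    text = BeautifulSoup(markup, "html.parser").get_text()
    return " ".join(text.split())


print(repr(compact))
print(repr(pretty))
print(visible_text(compact))
print(visible_text(pretty))
```

If the two visible-text lines differ for your page, writing str(soup) (or soup.decode()) to the output file in place of soup.prettify() is worth trying. Note that the lxml parser may still add missing html and body tags around a fragment, which is a separate effect from the whitespace.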
