Jeff

Reputation: 227

BeautifulSoup "fixes" html on Windows 8, but not OS X or Windows 7

BeautifulSoup is giving me different results on different platforms. Can anyone help me understand why?

Create a test.txt file with the following contents:

<S1>
</S1>

When I run the following snippet:

from bs4 import BeautifulSoup
with open("test.txt", 'r') as f:
    lines = f.read()
soup = BeautifulSoup(lines)
print soup

On Windows 7 and on Mac OS X, it gives the result:

<S1>
</S1>

But on Windows 8, it turns it into an HTML document:

<html><head></head><body>
<S1>
</S1>
</body></html>

I know that BeautifulSoup will attempt to fix malformed HTML, but why are the results different here? Why doesn't it always fix it?

Note that on all 3 platforms the same versions of Python and BeautifulSoup were used (2.7.5 and 4.1.3).

Upvotes: 0

Views: 167

Answers (1)

Sabuj Hassan

Reputation: 39365

When you don't name a parser, BeautifulSoup uses the best one you have installed. I believe your Windows 8 machine has a different parser installed than your other two systems.
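One way to check this on each machine is to ask bs4 for every parser by name and see which requests succeed. The snippet below is a minimal sketch, not part of the original answer: the helper name `available_parsers` is made up for illustration, and it assumes a bs4 version that exposes `FeatureNotFound` (the exception bs4 raises when a requested parser is not installed).

```python
from bs4 import BeautifulSoup, FeatureNotFound


def available_parsers(candidates=("lxml", "html5lib", "html.parser")):
    """Return the names in `candidates` that bs4 can actually load here.

    `available_parsers` is a hypothetical helper for this example only.
    """
    found = []
    for name in candidates:
        try:
            # Parsing an empty string is enough to force the parser lookup.
            BeautifulSoup("", name)
        except FeatureNotFound:
            # This parser is not installed on this machine.
            continue
        found.append(name)
    return found


print(available_parsers())
```

Running it on all three machines and comparing the lists should show whether the Windows 8 install has lxml or html5lib while the others only have the built-in parser.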

From the BeautifulSoup documentation:

If you don’t specify anything, you’ll get the best HTML parser that’s installed. Beautiful Soup ranks lxml’s parser as being the best, then html5lib’s, then Python’s built-in parser.
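To get the same output on every platform, you can name the parser explicitly instead of relying on the default ranking. The following is a sketch under the assumption that the built-in `html.parser` is acceptable for your input; the `render` helper is a made-up name for this example. Unlike lxml and html5lib, `html.parser` does not wrap a fragment in `<html>` and `<body>` tags.

```python
from bs4 import BeautifulSoup

# Same contents as the test.txt file from the question.
markup = "<S1>\n</S1>\n"


def render(markup, parser="html.parser"):
    """Parse `markup` with an explicitly named parser and return it as text.

    `render` is a hypothetical helper; the second argument to BeautifulSoup
    is what pins the parser.
    """
    return str(BeautifulSoup(markup, parser))


print(render(markup))
```

Passing `"lxml"` or `"html5lib"` as the second argument instead selects those parsers, provided they are installed.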

Upvotes: 1
