Reputation: 227
BeautifulSoup is giving me different results on different platforms. Can anyone help me understand why?
Create a test.txt file with the following contents:
<S1>
</S1>
When I run the following snippet:
from bs4 import BeautifulSoup
with open("test.txt", 'r') as f:
    lines = f.read()
soup = BeautifulSoup(lines)
print soup
On Windows 7 and on Mac OS X, it gives the result:
<S1>
</S1>
But on Windows 8, it changes it into an HTML document:
<html><head></head><body>
<S1>
</S1>
</body></html>
I know that BeautifulSoup will attempt to fix malformed HTML, but why are the results different here? Why doesn't it always fix it?
Note that on all 3 platforms the same versions of Python and BeautifulSoup were used (2.7.5 and 4.1.3).
Upvotes: 0
Views: 167
Reputation: 39365
Unless you name a parser explicitly, Beautiful Soup uses the best available parser you have installed. I believe your Windows 8 machine has a different parser installed than your other two systems.
From the Beautiful Soup documentation:
If you don’t specify anything, you’ll get the best HTML parser that’s installed. Beautiful Soup ranks lxml’s parser as being the best, then html5lib’s, then Python’s built-in parser.
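If that is what is happening, you can confirm it and get consistent output by checking which optional parsers each machine has, then naming the parser yourself. Here is a minimal sketch of that, not a definitive fix. The helper names `installed_parsers` and `parse_with` are my own, made up for illustration; `"html.parser"` is the name Beautiful Soup uses for Python's built-in parser.

```python
import importlib

from bs4 import BeautifulSoup


def installed_parsers():
    """Return which of the optional parser packages can be imported here."""
    found = []
    for module_name in ("lxml", "html5lib"):
        try:
            importlib.import_module(module_name)
            found.append(module_name)
        except ImportError:
            # Not installed on this machine; Beautiful Soup cannot pick it.
            pass
    return found


def parse_with(markup, parser="html.parser"):
    """Parse markup with an explicitly named parser and return it as text."""
    # Passing the parser name stops Beautiful Soup from choosing one for you.
    return str(BeautifulSoup(markup, parser))


# Run this on each machine and compare the lists.
print(installed_parsers())

# The built-in parser does not wrap the fragment in <html>/<body>.
print(parse_with("<S1>\n</S1>"))
```

If the first line printed differs between your machines, that would explain the different output. Passing the same parser name everywhere should give you the same result on all three.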
Upvotes: 1