Reputation: 19

Reading Local HTML File in Python

I've been reviewing examples of how to read HTML from websites using XPath and lxml. For some reason, when I try the same thing with a local file, I keep running into this error:

AttributeError: 'str' object has no attribute 'content'

This is the code:

with open(r'H:\Python\Project\File','r') as f:
    file = f.read()
f.close()

tree = html.fromstring(file.content)

Upvotes: 1

Answers (2)

johnashu

Reputation: 2211

Try passing encoding='utf-8' when you open the file:

f1 = open(new_file + '.html', 'r', encoding="utf-8")

Upvotes: 0

James

Reputation: 36608

You have a few problems with your code. It looks like you are adapting code that parses HTML from an HTTP/HTTPS request. In that case, .content is an attribute of the response object that holds the response body as bytes.

However, when reading from a file, f.read() already returns the contents of the file as a str, and a str has no .content attribute, which is why you get the AttributeError. Also, you don't need to call .close(); the with context manager takes care of that for you.
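To make the cause of the error concrete, here is a minimal sketch using only the standard library. The temporary file and its contents are made up for illustration and stand in for the asker's local HTML file.

```python
import os
import tempfile

# Create a small throwaway HTML file (hypothetical content, for illustration only).
with tempfile.NamedTemporaryFile(
    "w", suffix=".html", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("<html><body><p>Hello</p></body></html>")
    path = tmp.name

with open(path, "r", encoding="utf-8") as f:
    text = f.read()  # f.read() returns a plain str, not a response object

print(type(text).__name__)       # the type of what was read
print(hasattr(text, "content"))  # a str has no .content, hence the AttributeError
print(f.closed)                  # the with block has already closed the file

os.remove(path)  # clean up the temporary file
```

So the string returned by f.read() can be passed to the parser directly, and an explicit f.close() after the with block is redundant.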

Try this:

from lxml import html

with open(r'H:\Python\Project\File','r') as f:
    tree = html.fromstring(f.read())
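As a rough end-to-end sketch of the same approach, the example below parses a local file and then runs a few XPath queries on the result. It assumes lxml is installed; the sample markup, the temporary file, and the variable names are invented for illustration and are not from the question.

```python
import os
import tempfile

from lxml import html  # third-party package: pip install lxml

# Hypothetical sample document standing in for the local HTML file.
sample = (
    "<html><body><h1>Title</h1>"
    "<p class='note'>First</p><p>Second</p></body></html>"
)
with tempfile.NamedTemporaryFile(
    "w", suffix=".html", delete=False, encoding="utf-8"
) as tmp:
    tmp.write(sample)
    path = tmp.name

# Read the file and hand the resulting string straight to lxml.
with open(path, "r", encoding="utf-8") as f:
    tree = html.fromstring(f.read())

# Query the parsed tree with XPath, as you would for a downloaded page.
heading = tree.xpath("//h1/text()")
paragraphs = tree.xpath("//p/text()")
notes = tree.xpath("//p[@class='note']/text()")

print(heading, paragraphs, notes)

os.remove(path)  # clean up the temporary file
```

lxml also offers html.parse(path), which opens the file itself and returns an ElementTree; calling .getroot() on that gives the root element. Either route should work for a local file.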

Upvotes: 1
