Reputation: 19
I've been reviewing examples of how to read HTML from websites using XPath and lxml. For some reason, when I try with a local file, I keep running into this error:
AttributeError: 'str' object has no attribute 'content'
This is the code
with open(r'H:\Python\Project\File','r') as f:
    file = f.read()
    f.close()
tree = html.fromstring(file.content)
Upvotes: 1
Views: 11601
Reputation: 2211
Try encoding='utf-8'
f1 = open(new_file + '.html', 'r', encoding="utf-8")
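As a rough illustration of what the explicit encoding does, here is a minimal sketch. The temporary file and its contents are made up for the example. Passing encoding="utf-8" makes the read independent of the platform's default encoding, though on its own it would not remove the AttributeError from the question.

```python
import tempfile
from pathlib import Path

# Hypothetical sample file containing a non-ASCII character.
path = Path(tempfile.mkdtemp()) / "page.html"
path.write_text("<p>café</p>", encoding="utf-8")

# An explicit encoding avoids relying on the platform default
# (which is often not UTF-8 on Windows).
with open(path, "r", encoding="utf-8") as f1:
    markup = f1.read()

print(type(markup).__name__)
```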
Upvotes: 0
Reputation: 36608
You have a few problems with your code. It looks like you are modifying code that parses HTML from an HTTP/HTTPS request. In that case, .content is an attribute of the response object that holds the response body as bytes.
However, when reading from a file, f.read() inside your with block already returns the contents of the file as a str, and a str has no .content attribute. Also, you don't need to call .close(); the context manager takes care of that for you.
Try this:
from lxml import html

with open(r'H:\Python\Project\File','r') as f:
    tree = html.fromstring(f.read())
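To make the difference concrete, here is a small sketch of why the original call fails and what a file read returns in each mode. The sample file and its markup are invented for the example, and the lxml import is guarded because it is a third-party package that may not be installed.

```python
import tempfile
from pathlib import Path

# Hypothetical sample file standing in for the local HTML file.
sample = Path(tempfile.mkdtemp()) / "sample.html"
sample.write_text("<html><body><h1>Hello</h1></body></html>", encoding="utf-8")

# Text mode: f.read() returns a str, which has no .content attribute.
with open(sample, "r", encoding="utf-8") as f:
    text = f.read()

has_content = hasattr(text, "content")  # False, hence the AttributeError

# Binary mode: f.read() returns bytes, the same type that a
# requests response exposes through its .content attribute.
with open(sample, "rb") as f:
    raw = f.read()

try:
    from lxml import html  # third-party; assumed installed, as in the question
    tree = html.fromstring(raw)
    heading = tree.xpath("//h1/text()")
except ImportError:
    heading = None  # lxml not available; the parsing step is skipped
```

html.fromstring accepts either the str or the bytes, so the only change needed in the question's code is to pass the result of f.read() directly.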
Upvotes: 1