Reputation: 31
I'm working on a Python project for which I created a test Wix website. I want to get the text from the site using urllib, so I called url.urlopen(ADDRESS).readlines(). The problem is that the result contains none of the text shown on the page, only HTML describing the page's structure. How can I extract the text I'm after from the website?
Upvotes: 1
Views: 818
Reputation: 688
I think you'll end up needing to parse the HTML to get the information you want. Check out this module from the Python standard library:
https://docs.python.org/3/library/html.parser.html
You could potentially do something like this:
from html.parser import HTMLParser

rel_data = []  # text fragments collected while parsing

class MyHTMLParser(HTMLParser):
    def handle_data(self, data):
        # Called for each run of text found between tags
        rel_data.append(data)

parser = MyHTMLParser()
parser.feed('<html><head><title>Test</title></head>'
            '<body><h1>Parse me!</h1></body></html>')
print(rel_data)
Output:
['Test', 'Parse me!']
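To connect this back to the original question, here is a rough sketch of how the same idea might be applied to a page fetched with urllib. The names TextExtractor, extract_text and fetch_text are made up for illustration, and the UTF-8 fallback is an assumption. It also skips the contents of script and style tags, since html.parser passes those to handle_data as well.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collect text between tags, ignoring <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []       # non-empty text fragments, in document order
        self._skip_depth = 0   # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def extract_text(html):
    """Return the list of text fragments found in an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


def fetch_text(url):
    """Hypothetical helper: download a page and extract its text."""
    with urlopen(url) as response:
        # Assumption: fall back to UTF-8 if the server sends no charset.
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_text(html)
```

You would then call something like fetch_text(ADDRESS) with your own site's address. One caveat: if the page builds its text with JavaScript in the browser, the raw HTML returned by urlopen may not contain that text at all, and no amount of parsing will recover it.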
Upvotes: 1