Reputation: 305
def handle_starttag(self, tag, attrs):
    print(attrs)
[]
Why is my attrs an empty list, and where is the data inside the tags? I need that data, either from handle_data or from attrs.
import urllib.request
from html.parser import HTMLParser
import sys

class myHTMLParser(HTMLParser):
    def __init__(self):
        HTMLParser.__init__(self)
        self.country = {}

    def handle_starttag(self, tag, attrs):
        if tag == 'currency_name':
            self.country[self.handle_data] = tag
            print(self.country)

    def handle_endtag(self, tag):
        pass

    def handle_data(self, data):
        return(data.strip())

def main():
    if len(sys.argv) > 1:
        link = sys.argv[1]
    else:
        link = 'http://www.bankofcanada.ca/stats/assets/xml/noon-five-day.xml'
    myparser = myHTMLParser()
    file = open(link, 'r')
    html = file.read()
    myparser.feed(html)
    file.close()

main()
Upvotes: 0
Views: 315
Reputation: 5440
I think you are confusing attributes with data. The document at the URL in your program does not have attributes, but it does have data. Attributes are the information inside the tags themselves, which is one way to carry information.
In the case of your page, the information is between the start tag and the end tag instead.
An attribute such as href is one way of carrying the information:
<a href="mysite.org"></a>
Text between the start tag and the end tag is another:
<p>this is text</p>
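To make the difference concrete, here is a minimal sketch that feeds both snippets to a parser and records what each handler receives. The class name EventRecorder and its events list are made up for illustration; only HTMLParser and its handler methods come from the standard library.

```python
from html.parser import HTMLParser

class EventRecorder(HTMLParser):
    """Hypothetical helper: records what each handler is called with."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs taken from inside the tag
        self.events.append(("start", tag, attrs))

    def handle_data(self, data):
        # data is the text found between tags
        self.events.append(("data", data))

recorder = EventRecorder()
recorder.feed('<a href="mysite.org"></a><p>this is text</p>')
recorder.close()

for event in recorder.events:
    print(event)
```

The a tag arrives with a non-empty attrs list and no data, while the p tag arrives with an empty attrs list followed by a separate handle_data call.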
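As there are no attributes, that list is empty. The data is passed to handle_data as its data argument. The parser ignores whatever handle_data returns, so the text has to be stored on the parser object from inside handle_data.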
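One way to collect the text, sketched under assumptions: remember in handle_starttag that the tag of interest has opened, and save the text when handle_data is called. The class name TagTextCollector, its fields, and the sample string are hypothetical. The real feed's layout is not shown in the question, so only the currency_name tag name is taken from it.

```python
from html.parser import HTMLParser

class TagTextCollector(HTMLParser):
    """Hypothetical helper: collects the text found inside one tag name."""

    def __init__(self, wanted_tag):
        super().__init__()
        self.wanted_tag = wanted_tag
        self.inside = False   # True while we are between <wanted_tag> and </wanted_tag>
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag == self.wanted_tag:
            self.inside = True

    def handle_endtag(self, tag):
        if tag == self.wanted_tag:
            self.inside = False

    def handle_data(self, data):
        text = data.strip()
        # Store the text here; a return value would be discarded by the parser.
        if self.inside and text:
            self.values.append(text)

# Made-up sample input, not the actual Bank of Canada document.
sample = (
    "<rates>"
    "<currency_name>U.S. dollar</currency_name>"
    "<currency_name>Euro</currency_name>"
    "</rates>"
)

collector = TagTextCollector("currency_name")
collector.feed(sample)
collector.close()
print(collector.values)
```

Note that HTMLParser lowercases tag names before calling the handlers, so the comparison should use a lowercase name.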
Upvotes: 1