Torched90

Reputation: 305

Python 3 - HTML Parser - Empty Attributes

def handle_starttag(self, tag, attrs):
    print(attrs)

[]

Why is attrs an empty list? Where is the data inside the tags? I need that data, either from handle_data or from attrs.

import urllib.request
from html.parser import HTMLParser
import sys

class myHTMLParser(HTMLParser):
    
    def __init__(self):
        HTMLParser.__init__(self)
        self.country = {}
        
    def handle_starttag(self, tag, attrs):
        if tag == 'currency_name':
            self.country[self.handle_data] = tag
        print(self.country)
        
    def handle_endtag(self, tag):
        pass
    
    def handle_data(self, data):
        return(data.strip())
    
def main():
    if len(sys.argv) > 1:
        link = sys.argv[1]
    else:   
        link = 'http://www.bankofcanada.ca/stats/assets/xml/noon-five-day.xml' 
        
        
    myparser = myHTMLParser()    
    file = open(link, 'r')
    html = file.read()
    myparser.feed(html)
    file.close()
main()

Upvotes: 0

Views: 315

Answers (1)

jcoppens

Reputation: 5440

I think you are mixing up attributes and data. At least in the document at the URL in your program, the tags have no attributes, but they do contain data. Attributes are the name/value pairs written inside the start tag itself. That is one way to carry information.

In the case of your page, the information is between the start tag and the end tag.

For example, <a href="mysite.org"></a> carries the information in an attribute, while

 <p>this is text</p>

carries it as data between the tags.
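To see the difference concretely, here is a minimal sketch that feeds both snippets above to a parser and records what each callback receives. The class name `EventRecorder` and its `events` list are made up for illustration.

```python
from html.parser import HTMLParser


class EventRecorder(HTMLParser):
    """Hypothetical helper: records what the parser passes to each callback."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples taken from inside the start tag
        self.events.append(('start', tag, attrs))

    def handle_data(self, data):
        # data is the text found between a start tag and an end tag
        self.events.append(('data', data))


recorder = EventRecorder()
recorder.feed('<a href="mysite.org"></a><p>this is text</p>')
recorder.close()

for event in recorder.events:
    print(event)
# ('start', 'a', [('href', 'mysite.org')])
# ('start', 'p', [])
# ('data', 'this is text')
```

The `<p>` tag has no attributes, so its `attrs` list is empty, exactly like the `[]` printed in the question.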

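As there are no attributes, that list is empty. The data is passed to handle_data as its argument. The parser ignores whatever that method returns, so store the text on self (for example in self.country) instead of returning it.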
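One way to collect the text is to remember in handle_starttag that a `currency_name` tag has opened, and then save the text when handle_data is called. This is only a sketch: the class name `CurrencyNameParser`, the `names` list, and the sample markup are assumptions for illustration, not the actual contents of the Bank of Canada file.

```python
from html.parser import HTMLParser


class CurrencyNameParser(HTMLParser):
    """Hypothetical parser: collects the text inside <currency_name> tags."""

    def __init__(self):
        super().__init__()
        self.in_currency_name = False  # True while inside <currency_name>
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag == 'currency_name':
            self.in_currency_name = True

    def handle_endtag(self, tag):
        if tag == 'currency_name':
            self.in_currency_name = False

    def handle_data(self, data):
        # The return value of this callback is discarded, so store the text.
        text = data.strip()
        if self.in_currency_name and text:
            self.names.append(text)


# Made-up sample input; the real document's layout may differ.
sample = """
<observations>
  <currency_name>U.S. dollar</currency_name>
  <currency_name>Euro</currency_name>
</observations>
"""

parser = CurrencyNameParser()
parser.feed(sample)
parser.close()
print(parser.names)  # ['U.S. dollar', 'Euro']
```

Note that the program in the question passes a URL to open(), which only reads local files. To parse the remote document, the text would first have to be downloaded, for example with urllib.request.urlopen.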

Upvotes: 1
