McPedr0

Reputation: 571

return data from HTMLParser handle_starttag

My question is a simpler version of this

I have a youtube iframe:

<iframe width="560" height="315" src="//www.youtube.com/embed/fY9UhIxitYM" frameborder="0" allowfullscreen></iframe>

I'm working on a small web app and need to extract the video ID from the src attribute (fY9UhIxitYM in this case). I want to use the standard library rather than importing Beautiful Soup.

from HTMLParser import HTMLParser

class YoutubeLinkParser(HTMLParser):
    def __init__(self):
        HTMLParser.__init__(self)
        self.data = []

    def handle_starttag(self, tag, attrs):
        data = attrs[2][1].split('/')[-1]
        self.data.append(data)

iframe = open('iframe.html').read()
parser = YoutubeLinkParser()
linkCode = parser.feed(iframe)

The examples I have found use handle_data(self, data), but I need the value of an attribute on the opening tag. I can print the value inside the method, but when I try to capture it as a return value, linkCode is None.

What am I missing? Thanks!

Upvotes: 5

Views: 6274

Answers (1)

alecxe

Reputation: 474171

The feed() method doesn't return anything, which is why you are getting None. Instead, read the parser's data attribute after calling feed():

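from HTMLParser import HTMLParser

class YoutubeLinkParser(HTMLParser):
    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; index 2 is src in this iframe
        self.data = attrs[2][1].split('/')[-1]

# read the markup and close the file when done
with open('iframe.html') as f:
    iframe = f.read()

parser = YoutubeLinkParser()
parser.feed(iframe)  # returns None; the result is stored on the parser
print parser.data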

Prints:

fY9UhIxitYM
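
If it helps, here is a slightly more defensive sketch of the same idea. It is only an illustration and makes a few assumptions that are not in the code above: it targets Python 3, where the class lives in html.parser instead of the HTMLParser module; it looks up src by name rather than by position; and it stores results in a list called video_ids, which is a made-up attribute name. The iframe markup is inlined so the snippet runs without iframe.html.

```python
from html.parser import HTMLParser  # Python 3 module name (assumption: not Python 2)


class YoutubeLinkParser(HTMLParser):
    """Collect the last path segment of each <iframe> src attribute."""

    def __init__(self):
        super().__init__()
        self.video_ids = []  # hypothetical attribute name, filled by handle_starttag

    def handle_starttag(self, tag, attrs):
        # handle_starttag fires for every opening tag, so skip non-iframes
        if tag != "iframe":
            return
        # attrs is a list of (name, value) pairs; look up src by name
        src = dict(attrs).get("src")
        if src:
            self.video_ids.append(src.rstrip("/").split("/")[-1])


html = ('<iframe width="560" height="315" '
        'src="//www.youtube.com/embed/fY9UhIxitYM" '
        'frameborder="0" allowfullscreen></iframe>')

parser = YoutubeLinkParser()
result = parser.feed(html)  # feed() returns None; results live on the parser
print(result)               # None
print(parser.video_ids)     # ['fY9UhIxitYM']
```

Looking up src by name avoids depending on the order of the attributes, and checking the tag name keeps other elements in the page from raising an IndexError.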

Upvotes: 7
