Zain Rk

Reputation: 11

I am unable to understand the meaning of this line of code while working with the Beautiful Soup module

def search(self, topic, site):
    bs = self.getPage(site.searchurl + topic)
    searchresults = bs.select(site.resultingList)
    for result in searchresults:
        url = result.select(site.resulturl)[0].attrs["href"]
        if site.absoluteUrl:
            bs = self.getPage(url)
        else:
            bs = self.getPage(site.url + url)
        if bs is None:
            print("Something was wrong with that page or URL. Skipping!")
            return
        title = self.safeGet(bs, site.titleTag)
        body = self.safeGet(bs, site.bodyTag)
        if title != '' and body != '':
            content = Content(topic, title, body, url)
            content.print()

In this code what is the meaning of:

result.select(site.resulturl)[0].attrs["href"]

More specifically, I'm unable to understand attrs["href"].

Upvotes: 0

Views: 31

Answers (1)

goalie1998

Reputation: 1432

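attrs["href"] pulls the "href" attribute from result.select(site.resulturl)[0]. select() returns a list of the elements inside result that match the CSS selector stored in site.resulturl, so that selector most likely matches one or more <a ... href="..."> tags (or any other tag that has an "href" attribute). [0] takes the first element in that list, and attrs is that element's dictionary of attributes, so the line is pulling the link out of the first match.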
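
To make that concrete, here is a minimal sketch that walks through the same expression one step at a time. The HTML snippet and the selector "a" are made up for illustration (the real value of site.resulturl is not shown in the question), and it assumes the bs4 package is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a single search result.
html = """
<div class="result">
  <h3><a href="/articles/python-intro">Intro to Python</a></h3>
  <a href="/articles/python-advanced">Advanced Python</a>
</div>
"""

result = BeautifulSoup(html, "html.parser")

# Hypothetical stand-in for site.resulturl: a CSS selector string.
result_url_selector = "a"

# select() returns a list of every Tag matching the selector, in document order.
matches = result.select(result_url_selector)

# [0] picks the first matching tag.
first = matches[0]

# attrs is a dict of that tag's HTML attributes.
print(first.attrs)

# Looking up "href" in that dict gives the link target.
url = first.attrs["href"]
print(url)
```

Note that [0] raises an IndexError if the selector matches nothing, and attrs["href"] raises a KeyError if the tag has no "href" attribute. If either can happen on the pages being scraped, checking that the list is non-empty and using first.get("href"), which returns None for a missing attribute, is a safer variant.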

Upvotes: 1
