Unable to get correct link in BeautifulSoup

Question

I'm trying to parse a bit of HTML and I'd like to extract the link that matches a particular pattern. I'm using the find method with a regular expression but it doesn't get me the correct link. Here's my snippet. Could someone tell me what I'm doing wrong?

from BeautifulSoup import BeautifulSoup
import re

html = """

    RT
    Trailer – 
    IMDB – 

"""

soup = BeautifulSoup(html)
print soup.find('a', href = re.compile(r".*title/tt.*"))['href']

I should be getting the second link but BS always returns the first link. The href of the first link doesn't even match my regex so why does it return it?

Thanks.

Katriel · Accepted Answer

find returns only the first matching tag. To get every match, use findAll.
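
The difference can be sketched as follows. This is only an illustration under a few assumptions: it uses the newer bs4 package (where findAll is spelled find_all) rather than the BeautifulSoup 3 import from the question, and the markup and example.com hrefs are made-up placeholders, not the links from the original snippet.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup: these hrefs are placeholders, not the ones from the question.
html = """
<div>
    <a href="http://example.com/reviews/1">RT</a>
    <a href="http://example.com/watch/1">Trailer</a>
    <a href="http://example.com/title/tt0000001/">IMDB</a>
    <a href="http://example.com/title/tt0000002/">IMDB 2</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# bs4 applies the pattern with search(), so no leading/trailing ".*" is needed.
pattern = re.compile(r"title/tt")

# find() stops at the first <a> whose href matches the pattern.
first = soup.find("a", href=pattern)
first_href = first["href"]

# find_all() returns every matching <a>, in document order.
all_hrefs = [a["href"] for a in soup.find_all("a", href=pattern)]

print(first_href)
print(all_hrefs)
```

With well-formed markup like this, find skips the non-matching links and returns the first one whose href matches, while find_all returns both matches as a list.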
