Alk

Reputation: 5547

Extract all Links in Table with Beautiful Soup

<td style="text-align: center;"><a title="Some title" href="https://www.blabla.com">Testing</a></td>

I'm trying to use BeautifulSoup to get the href of every a tag that is a child of a td tag.

I can run

urls = [x for x in soup.findAll("td")]

to obtain all the td tags, and then loop through them manually to check whether each one contains an a tag and, if so, extract its href. Is there a cleaner way to do this in one line?

Upvotes: 0

Views: 2112

Answers (1)

MendelG

Reputation: 20008

Try using the :has() CSS selector to select every td tag that contains an <a> tag, then read the href from that tag:

from bs4 import BeautifulSoup

html = """<td style="text-align: center;"><a title="Some title" href="https://www.blabla.com">Testing</a></td>"""
soup = BeautifulSoup(html, "html.parser")
print([tag.find("a")["href"] for tag in soup.select("td:has(a)")])

Output:

['https://www.blabla.com']
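
Note that td:has(a) matches a td with an a tag anywhere inside it, and tag.find("a")["href"] raises a KeyError if that a tag has no href. As a possible variation (a sketch, not part of the original answer), the child combinator and an attribute selector can pick the links directly. The extra cells and the https://outside.example URL below are made-up sample data for illustration:

```python
from bs4 import BeautifulSoup

# Sample markup (hypothetical, extended from the question's snippet):
# one linked cell, one anchor without href, one plain cell, one link outside the table.
html = """
<table><tr>
<td style="text-align: center;"><a title="Some title" href="https://www.blabla.com">Testing</a></td>
<td><a name="anchor-only">No link here</a></td>
<td>Plain cell</td>
</tr></table>
<p><a href="https://outside.example">Outside the table</a></p>
"""

soup = BeautifulSoup(html, "html.parser")

# "td > a[href]" selects only <a> tags that are direct children of a <td>
# and that actually carry an href attribute, so a["href"] cannot fail.
hrefs = [a["href"] for a in soup.select("td > a[href]")]
print(hrefs)
```

If the links may be nested deeper inside the cell (for example wrapped in a span), "td a[href]" with a descendant combinator should cover that case instead.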

Upvotes: 2
