Reputation: 5547
<td style="text-align: center;"><a title="Some title" href="https://www.blabla.com">Testing</a></td>
I'm trying to use BeautifulSoup to get the href of every a tag that is a child of a td tag. I can run
urls = [x for x in soup.findAll("td")]
to obtain all the td tags and then loop through them manually to see if they contain an a tag and, if so, extract the href, but is there a cleaner way of doing this in one line?
Upvotes: 0
Views: 2112
Reputation: 20008
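Try using the :has() CSS selector to select all td tags that contain an <a> tag, then read the href from each match. Note that td:has(a) matches a td with an <a> anywhere inside it; use td:has(> a) if the <a> must be a direct child.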
from bs4 import BeautifulSoup

html = """<td style="text-align: center;"><a title="Some title" href="https://www.blabla.com">Testing</a></td>"""
soup = BeautifulSoup(html, "html.parser")

# td:has(a) keeps only the td tags that contain an a tag;
# find("a") then returns the first a tag inside each of them.
print([tag.find("a")["href"] for tag in soup.select("td:has(a)")])
Output:
['https://www.blabla.com']
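If a td can hold more than one link, or an a tag without an href, it may be simpler to select the a tags directly rather than the td tags. The following is only a sketch under assumptions: the table markup and the example.com URLs are made up for illustration and are not from the question. It uses the child combinator (>) and an attribute selector ([href]), both of which select() understands.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only (not from the question).
html = """
<table>
  <tr>
    <td><a href="https://www.blabla.com">Testing</a></td>
    <td><a href="https://example.com/one">One</a> <a href="https://example.com/two">Two</a></td>
    <td><span><a href="https://example.com/nested">Nested</a></span></td>
    <td><a name="anchor-only">No href</a></td>
    <td>No link here</td>
  </tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# "td > a[href]": a tags that are direct children of a td and carry an href.
direct_child_hrefs = [a["href"] for a in soup.select("td > a[href]")]

# "td a[href]": a tags at any depth inside a td that carry an href.
descendant_hrefs = [a["href"] for a in soup.select("td a[href]")]

print(direct_child_hrefs)
print(descendant_hrefs)
```

The [href] filter means an a tag without that attribute is skipped instead of raising a KeyError, and every link in a td is returned rather than only the first one.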
Upvotes: 2