stwhite

Reputation: 3275

Get immediate parent tag with BeautifulSoup in Python

I've researched this question but haven't found an actual solution. I'm using BeautifulSoup with Python, and I want to get all image tags from a page, loop through them, and check whether each one's immediate parent is an anchor tag.

Here's some pseudo code:

from bs4 import BeautifulSoup

html = BeautifulSoup(responseHtml, 'html.parser')

for image in html.find_all('img'):
    # .parent is the immediately enclosing tag
    if image.parent.name == 'a':
        link = image.parent.get('href')

Any ideas on this?

Upvotes: 25

Views: 41005

Answers (1)

alecxe

Reputation: 473763

You need to check the parent's name:

for img in soup.find_all('img'):
    if img.parent.name == 'a':
        print("Parent is a link")

Demo:

>>> from bs4 import BeautifulSoup
>>> 
>>> data = """
... <body>
...     <a href="google.com"><img src="image.png"/></a>
... </body>
... """
>>> soup = BeautifulSoup(data, 'html.parser')
>>> img = soup.img
>>> 
>>> img.parent.name
'a'
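Putting the check into a small helper may make the behaviour easier to see. This is only a sketch: the function name linked_images and the sample markup are made up for illustration, and it assumes the built-in 'html.parser' backend. Note that an img wrapped in another tag inside the link is not matched, because .parent only looks one level up.

```python
from bs4 import BeautifulSoup


def linked_images(html):
    """Return (src, href) pairs for <img> tags whose immediate parent is an <a>.

    Hypothetical helper for illustration; not part of BeautifulSoup.
    """
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for img in soup.find_all("img"):
        parent = img.parent
        # .parent is the directly enclosing tag, not any ancestor
        if parent is not None and parent.name == "a":
            pairs.append((img.get("src"), parent.get("href")))
    return pairs


# Made-up sample markup: one direct child, one nested in a <span>, one unlinked
sample = """
<a href="google.com"><img src="direct.png"/></a>
<a href="example.com"><span><img src="nested.png"/></span></a>
<p><img src="plain.png"/></p>
"""

print(linked_images(sample))
```

If an image nested deeper inside a link should also count, img.find_parent('a') searches all ancestors instead of only the immediate one.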

You can also retrieve the img tags whose direct parent is an a tag using a CSS selector:

soup.select('a > img')
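As a rough illustration of why the > matters, the sketch below compares the child combinator with the plain descendant selector. The sample markup is invented for the example, and 'html.parser' is assumed as the parser.

```python
from bs4 import BeautifulSoup

# Made-up markup: one img directly inside a link, one nested in a <span>
sample = """
<a href="google.com"><img src="direct.png"/></a>
<a href="example.com"><span><img src="nested.png"/></span></a>
"""

soup = BeautifulSoup(sample, "html.parser")

# 'a > img' matches only images that are direct children of a link
direct = [img["src"] for img in soup.select("a > img")]

# 'a img' matches images anywhere below a link
anywhere = [img["src"] for img in soup.select("a img")]

print(direct)
print(anywhere)
```

Here direct should contain only direct.png, while anywhere also picks up nested.png.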

Upvotes: 38
