Get immediate parent tag with BeautifulSoup in Python

stwhite · Jan 10, 2015 · Viewed 22.2k times

I've researched this question but haven't found an actual solution. I'm using BeautifulSoup with Python, and I want to get all image tags from a page, loop through them, and check whether each one's immediate parent is an anchor tag.

Here's some pseudo code:

html = BeautifulSoup(responseHtml)

for image in html.findAll('img'):
    if (image.parent.name == 'a'):
         image.hasParent = image.parent.link

Any ideas on this?

Answer

alecxe · Jan 10, 2015

You need to check the parent's name:

for img in soup.find_all('img'):
    # .parent is the tag that immediately encloses the image
    if img.parent.name == 'a':
        print("Parent is a link")

Demo:

>>> from bs4 import BeautifulSoup
>>> 
>>> data = """
... <body>
...     <a href="google.com"><img src="image.png"/></a>
... </body>
... """
>>> soup = BeautifulSoup(data, "html.parser")
>>> img = soup.img
>>> 
>>> img.parent.name
'a'
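
To go one step further and read the link target, which the question's pseudo code attempts with image.parent.link, the parent's href attribute can be read with .get(). This is only a minimal sketch: the sample markup, the "html.parser" choice, and the image_links name are assumptions made for illustration, not part of the original answer.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: one linked image and one unlinked image.
data = """
<body>
    <a href="google.com"><img src="linked.png"/></a>
    <p><img src="plain.png"/></p>
</body>
"""

soup = BeautifulSoup(data, "html.parser")

# Map each image's src to the href of its immediate parent link.
image_links = {}
for img in soup.find_all("img"):
    parent = img.parent
    if parent is not None and parent.name == "a":
        # .get() returns None when the anchor has no href attribute.
        image_links[img.get("src")] = parent.get("href")

print(image_links)  # {'linked.png': 'google.com'}
```

The image inside the p tag is skipped because its immediate parent is not an anchor.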

You can also retrieve the img tags whose direct parent is an a tag using a CSS selector:

soup.select('a > img')
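
The > in that selector is the child combinator, so it only matches images that sit directly inside a link. A plain space (a img) would also match images nested deeper. A small sketch of the difference, using made-up markup and assuming the built-in "html.parser":

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second image is wrapped in a span inside the link.
data = """
<body>
    <a href="one.html"><img src="direct.png"/></a>
    <a href="two.html"><span><img src="nested.png"/></span></a>
</body>
"""

soup = BeautifulSoup(data, "html.parser")

# Child combinator: the img must be an immediate child of the a tag.
direct = [img["src"] for img in soup.select("a > img")]

# Descendant combinator: the img may be nested at any depth inside the a tag.
nested = [img["src"] for img in soup.select("a img")]

print(direct)  # ['direct.png']
print(nested)  # ['direct.png', 'nested.png']
```

Since the question asks about the immediate parent, a > img is the form that matches the img.parent.name == 'a' check.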