STORM

Reputation: 4331

Beautiful Soup get nested span by class within another span

Within a very large HTML page, I want to get a span by its class, which is unique. Its child span can also be queried by class, but that class is not unique.

...
<span class="uniqueParent">
   <span class="notUniqueChildClassName">
      I am the child
   </span>
</span> 
...

Output should be "I am the child".

I have tried:

s = soup.select('span[class="uniqueParent"] > span[class="notUniqueChildClassName"]')
s.text

and

s = soup.find('span[class="uniqueParent"] > span[class="notUniqueChildClassName"]')
s.text

But neither worked.

Upvotes: 1

Views: 118

Answers (2)

Andrej Kesely

Reputation: 195448

You can use a CSS selector with a dot (e.g. .uniqueParent instead of class="uniqueParent"):

from bs4 import BeautifulSoup


html_doc = """\
<span class="uniqueParent">
   <span class="notUniqueChildClassName">
      I am the child
   </span>
</span> """


soup = BeautifulSoup(html_doc, "html.parser")

print(soup.select_one(".uniqueParent .notUniqueChildClassName").text)

Prints:


      I am the child
   
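The printed text keeps the indentation and line breaks from the source HTML. As a minimal sketch, assuming the same html_doc as above, the whitespace can be dropped with get_text(strip=True), and the > combinator can be used if only a direct child should match (the names child and text are just for illustration):

```python
from bs4 import BeautifulSoup

html_doc = """\
<span class="uniqueParent">
   <span class="notUniqueChildClassName">
      I am the child
   </span>
</span> """

soup = BeautifulSoup(html_doc, "html.parser")

# ">" restricts the match to direct children of .uniqueParent;
# a plain space (as in the answer above) matches any descendant.
child = soup.select_one(".uniqueParent > .notUniqueChildClassName")

# strip=True removes the leading/trailing whitespace of the text node.
text = child.get_text(strip=True)
print(text)
```

Note that select_one returns None when nothing matches, so on real pages it may be worth checking the result before calling get_text.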

Upvotes: 1

Jack Fleeting

Reputation: 24930

Try changing the first attempt to

soup.select_one('span[class="uniqueParent"] > span[class="notUniqueChildClassName"]').text.strip()

on your actual HTML.

The output should be what you're looking for.
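For context, here is a rough sketch of why the two attempts in the question likely failed, using the sample HTML from the question (the variable names are illustrative only). select returns a list-like result rather than a single tag, so .text has to be read from an element of it, and find expects a tag name or filters rather than a CSS selector, so it finds nothing:

```python
from bs4 import BeautifulSoup

html_doc = """\
<span class="uniqueParent">
   <span class="notUniqueChildClassName">
      I am the child
   </span>
</span> """

soup = BeautifulSoup(html_doc, "html.parser")
selector = 'span[class="uniqueParent"] > span[class="notUniqueChildClassName"]'

# select() returns a list-like collection of tags, not a single tag,
# so index into it (or use select_one) before reading .text.
matches = soup.select(selector)
first = matches[0].text.strip()
print(first)

# find() treats its first argument as a tag name, not a CSS selector,
# so no tag matches and the result is None.
not_found = soup.find(selector)
print(not_found)
```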

Upvotes: 1
