Reputation: 4331
Within a very large HTML page I want to get a span by its class, which is unique. The child span of this one can also be queried by class, but that class is not unique.
...
<span class="uniqueParent">
<span class="notUniqueChildClassName">
I am the child
</span>
</span>
...
Output should be "I am the child".
I have tried:
s = soup.select('span[class="uniqueParent"] > span[class="notUniqueChildClassName"]')
s.text
and
s = soup.find('span[class="uniqueParent"] > span[class="notUniqueChildClassName"]')
s.text
But neither worked.
Upvotes: 1
Views: 118
Reputation: 195448
You can use a CSS selector with a dot (e.g. .uniqueParent instead of class="uniqueParent"):
from bs4 import BeautifulSoup
html_doc = """\
<span class="uniqueParent">
<span class="notUniqueChildClassName">
I am the child
</span>
</span> """
soup = BeautifulSoup(html_doc, "html.parser")
print(soup.select_one(".uniqueParent .notUniqueChildClassName").text)
Prints:
I am the child
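One thing the example above does not show is that the non-unique class may also appear outside the parent, and that .text keeps the newlines around the text. Here is a minimal sketch of that case; the extra "decoy" span is a made-up addition to the sample HTML, and get_text(strip=True) is used to trim the whitespace:

```python
from bs4 import BeautifulSoup

# Sample markup; the first span is a hypothetical decoy that shares the
# non-unique class but sits outside the unique parent.
html_doc = """\
<span class="notUniqueChildClassName">I am a decoy</span>
<span class="uniqueParent">
<span class="notUniqueChildClassName">
I am the child
</span>
</span>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# select_one returns the first matching Tag, or None if nothing matches.
child = soup.select_one(".uniqueParent .notUniqueChildClassName")

# Guard against None so a missing element does not raise AttributeError.
text = child.get_text(strip=True) if child is not None else None
print(text)
```

Because the selector is scoped to .uniqueParent, the decoy is skipped even though it comes first in the document.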
Upvotes: 1
Reputation: 24930
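Try changing the first attempt to the following and running it on your actual HTML:
soup.select_one('span[class="uniqueParent"] > span[class="notUniqueChildClassName"]').text.strip()
select_one returns a single element rather than a list, and strip() removes the surrounding whitespace, so the output should be what you're looking for.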
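For context, here is a rough sketch of why the two attempts in the question fail, using the sample HTML from the question. select() returns a list-like ResultSet rather than a single element, and find() interprets its first argument as a tag name, not as a CSS selector:

```python
from bs4 import BeautifulSoup

html_doc = """\
<span class="uniqueParent">
<span class="notUniqueChildClassName">
I am the child
</span>
</span>
"""

soup = BeautifulSoup(html_doc, "html.parser")
selector = 'span[class="uniqueParent"] > span[class="notUniqueChildClassName"]'

# select() returns a list of matches, so index into it before reading .text.
matches = soup.select(selector)
first_text = matches[0].text.strip() if matches else None

# find() looks for a tag literally named like the selector string,
# so it finds nothing and returns None.
not_found = soup.find(selector)

print(first_text)
print(not_found)
```

Note that [class="uniqueParent"] only matches when the class attribute is exactly that value, whereas .uniqueParent also matches elements that carry additional classes.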
Upvotes: 1