Reputation: 317
I am trying to scrape a site that has the following div:
<div class="sr-2">
<span>Test</span>
<span>Outcome 2234</span>
</div>
How can I scrape the data from the second span? I have been playing around with the code below, but I am not getting anywhere:
test_outcomes = container.find('div', class_='sr-2').text
Upvotes: 0
Views: 713
Reputation: 84465
I thought I would give Andrej a run for his money and add a few more options:
print(soup.select_one("div.sr-2 > span + span").text) # or:
print(soup.select_one("div.sr-2 > span ~ span").text) # or:
print(soup.select_one("div.sr-2 > span:nth-child(even)").text) # or:
print(soup.select_one("div.sr-2 > span:nth-child(n+2)").text) # or:
print(soup.select_one("div.sr-2 > span:last-child").text) # or:
print(soup.select_one("div.sr-2 > span:last-of-type").text) # or:
print(soup.select_one("div.sr-2 > span:nth-last-child(1)").text) # or:
print(soup.select_one("div.sr-2 > span").find_next('span').text) # or:
print(soup.select_one("div.sr-2 > span").find_next_sibling("span").text)
And there are yet more....... That is one very skinned cat (or well shaken carbuncle!).
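These selectors only agree because the div holds exactly two spans. As a minimal sketch, assuming a hypothetical variant of the markup with a third span (the "Extra" span is my addition, not part of the question), a few of them start to diverge:

```python
from bs4 import BeautifulSoup

# Hypothetical variant of the question's markup: a third <span> is added
# purely to show where the selectors stop agreeing.
html_doc = """
<div class="sr-2">
<span>Test</span>
<span>Outcome 2234</span>
<span>Extra</span>
</div>"""
soup = BeautifulSoup(html_doc, "html.parser")

selectors = [
    "div.sr-2 > span + span",          # first span that directly follows another span
    "div.sr-2 > span:nth-of-type(2)",  # always the second span
    "div.sr-2 > span:last-child",      # the last element child, whatever its position
    "div.sr-2 > span:last-of-type",    # the last span among its siblings
]

# select_one returns the first match in document order
results = {css: soup.select_one(css).text for css in selectors}
for css, text in results.items():
    print(f"{css}: {text}")
```

With three spans, the first two selectors still give "Outcome 2234" while the two "last" selectors give "Extra", so the positional ones are the safer choice if the target is specifically the second span.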
Now, of course, the important thing is to understand the differences between these and when, in reality, to use each of them. I would suggest reading the following:
Upvotes: 3
Reputation: 195438
You can use the following example to extract the text from the second <span>:
from bs4 import BeautifulSoup
html_doc = """
<div class="sr-2">
<span>Test</span>
<span>Outcome 2234</span>
</div>"""
soup = BeautifulSoup(html_doc, "html.parser")
print(soup.select_one("div.sr-2 > span:nth-of-type(2)").text) # or:
print(soup.select("div.sr-2 > span")[1].text) # or:
print(soup.find("div", class_="sr-2").find_all("span")[1].text)
Prints:
Outcome 2234
Outcome 2234
Outcome 2234
Upvotes: 4
Reputation: 753
You can use find_all and then index into the result to get the second one.
container.find('div', class_='sr-2').find_all("span")[1].text
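On a real page the div or its second span may be missing, in which case the one-liner raises AttributeError or IndexError. A rough sketch of a guarded version follows; the helper name second_span_text and its css_class parameter are my own, not part of BeautifulSoup, and container is assumed to be a parsed BeautifulSoup object or tag as in the question:

```python
from bs4 import BeautifulSoup


def second_span_text(container, css_class="sr-2"):
    """Return the text of the second <span> in the matching div, or None."""
    div = container.find("div", class_=css_class)
    if div is None:
        # find() returns None when no matching div exists
        return None
    spans = div.find_all("span")
    if len(spans) < 2:
        # Fewer than two spans: there is no second one to index
        return None
    return spans[1].get_text(strip=True)


html_doc = """
<div class="sr-2">
<span>Test</span>
<span>Outcome 2234</span>
</div>"""
container = BeautifulSoup(html_doc, "html.parser")
print(second_span_text(container))
```

For the markup in the question this prints "Outcome 2234"; for a page without the div it returns None instead of raising.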
Upvotes: 1