rob

Reputation: 317

Getting Second Span using BeautifulSoup

I am trying to scrape a site that has the following div:

<div class="sr-2">
    <span>Test</span>
    <span>Outcome 2234</span>
</div>

How can I scrape the data from the second span? I have been playing around with the code below, but I am not getting anywhere:

test_outcomes = container.find('div', class_='sr-2').text

Upvotes: 0

Views: 713

Answers (3)

QHarr

Reputation: 84465

I thought I would give Andrej a run for his money and add a few more options:

print(soup.select_one("div.sr-2 > span + span").text)   # or:
print(soup.select_one("div.sr-2 > span ~ span").text)   # or:
print(soup.select_one("div.sr-2 > span:nth-child(even)").text)   # or:
print(soup.select_one("div.sr-2 > span:nth-child(n+2)").text)   # or:
print(soup.select_one("div.sr-2 > span:last-child").text)   # or:
print(soup.select_one("div.sr-2 > span:last-of-type").text)   # or:
print(soup.select_one("div.sr-2 > span:nth-last-child(1)").text)   # or:
print(soup.select_one("div.sr-2 > span").find_next('span').text) # or:
print(soup.select_one("div.sr-2 > span").find_next_sibling("span").text)

And there are yet more... That is one very skinned cat (or well-shaken carbuncle!).

Now, of course, the important thing is to understand the differences between these and when, in practice, to use each one. I would suggest reading the following:

  1. bs4 documentation
  2. soupsieve documentation
  3. MDN Web Docs - CSS selectors
  4. whatwg.org selectors
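
To make those differences concrete, here is a minimal sketch. The markup is an assumption for illustration only: a leading <b> and a third <span> are added to the div from the question, and the helper name first_text is hypothetical. With the original two-span div, every selector below returns the same element.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the question's div plus a leading <b> and a third <span>,
# so that the selectors no longer all agree.
html_doc = """
<div class="sr-2">
    <b>Label</b>
    <span>Test</span>
    <span>Outcome 2234</span>
    <span>Extra</span>
</div>"""

soup = BeautifulSoup(html_doc, "html.parser")


def first_text(selector):
    """Return the text of the first element matching selector, or None."""
    match = soup.select_one(selector)
    return match.text if match is not None else None


selectors = [
    "div.sr-2 > span:nth-of-type(2)",   # second <span> among the <span> siblings
    "div.sr-2 > span + span",           # first <span> directly preceded by a <span>
    "div.sr-2 > span:nth-child(even)",  # first <span> at an even position among all children
    "div.sr-2 > span:last-child",       # <span> that is the last child of the div
]

results = {selector: first_text(selector) for selector in selectors}

for selector, text in results.items():
    print(f"{selector!r:40} -> {text!r}")
```

With this markup, nth-of-type(2) and span + span still give Outcome 2234, while nth-child(even) gives Test and last-child gives Extra, because nth-child counts every element sibling (including the <b>), not only the spans.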

Upvotes: 3

Andrej Kesely

Reputation: 195438

You can use the following example to extract the text from the second <span>:

from bs4 import BeautifulSoup

html_doc = """
<div class="sr-2">
    <span>Test</span>
    <span>Outcome 2234</span>
</div>"""

soup = BeautifulSoup(html_doc, "html.parser")

print(soup.select_one("div.sr-2 > span:nth-of-type(2)").text)   # or:
print(soup.select("div.sr-2 > span")[1].text)                   # or: 
print(soup.find("div", class_="sr-2").find_all("span")[1].text)

Prints:

Outcome 2234
Outcome 2234
Outcome 2234

Upvotes: 4

Vishesh Mangla

Reputation: 753

You can use find_all and then index into the result to get the second one.

container.find('div', class_='sr-2').find_all("span")[1].text
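
The one-liner above raises an error if the div is missing or contains fewer than two spans. A minimal defensive sketch follows, assuming that returning None is acceptable in those cases; the function name second_span_text and its css_class parameter are hypothetical, not part of bs4.

```python
from bs4 import BeautifulSoup


def second_span_text(container, css_class="sr-2"):
    """Return the stripped text of the second <span> inside the div, or None."""
    div = container.find("div", class_=css_class)
    if div is None:
        return None  # no matching div in this container
    spans = div.find_all("span")
    if len(spans) < 2:
        return None  # the div has no second span
    return spans[1].get_text(strip=True)


# Usage with the markup from the question.
html_doc = """
<div class="sr-2">
    <span>Test</span>
    <span>Outcome 2234</span>
</div>"""

container = BeautifulSoup(html_doc, "html.parser")
print(second_span_text(container))
```

Like the original line, find_all("span") also matches nested spans; passing recursive=False would restrict it to direct children of the div.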

Upvotes: 1
