Xia

Reputation: 51

BeautifulSoup: getting an error when trying to read the selected option value while scraping IMDb

I'm trying to get the selected value from the following HTML using BeautifulSoup, but I haven't been able to.

<select id="bySeason" tconst="tt0944947" class="current">
  <!--
  This ensures that we don't wind up accidentally marking two options
  (Unknown and the blank one) as selected.
  -->
  <option value="1">
    1
  </option>
  <!--
  This ensures that we don't wind up accidentally marking two options
  (Unknown and the blank one) as selected.
  -->
  <option selected="selected" value="8">
    2
  </option>
</select>

This is what I tried, but it raises an error:

season_container = page_html.find_all("select", class_="current")
print(season_container.find_all('option', selected=True))

Upvotes: 0

Views: 83

Answers (2)

bigbounty

Reputation: 17408

You can narrow the search by looking up the select element by its id, then finding the option inside it that carries a selected attribute.


from bs4 import BeautifulSoup

html = """<select id="bySeason" tconst="tt0944947" class="current">
  <!--
  This ensures that we don't wind up accidentally marking two options
  (Unknown and the blank one) as selected.
  -->
  <option value="1">
    1
  </option>
  <!--
  This ensures that we don't wind up accidentally marking two options
  (Unknown and the blank one) as selected.
  -->
  <option selected="selected" value="8">
    2
  </option>
</select>
"""

soup = BeautifulSoup(html, "html.parser")
selected_value = soup.find("select", {"id":"bySeason"}).find("option",selected=True)

print(selected_value.get_text(strip=True))
print("-------")
print(selected_value["value"])

Output:

2
-------
8
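The same lookup can probably also be written as a single CSS selector. This is only a sketch, assuming a BeautifulSoup version that provides select_one; the HTML is a shortened copy of the snippet from the question, and the names option, selected_text and selected_value are illustrative.

```python
from bs4 import BeautifulSoup

# Shortened copy of the <select> markup from the question.
html = """<select id="bySeason" tconst="tt0944947" class="current">
  <option value="1">
    1
  </option>
  <option selected="selected" value="8">
    2
  </option>
</select>
"""

soup = BeautifulSoup(html, "html.parser")

# "#bySeason option[selected]" matches any <option> inside the element with
# id="bySeason" that has a selected attribute, whatever its value is.
option = soup.select_one("#bySeason option[selected]")

# select_one returns None when nothing matches, so guard before reading from it.
selected_text = option.get_text(strip=True) if option is not None else None
selected_value = option["value"] if option is not None else None

print(selected_text)
print(selected_value)
```

The None check matters on a live page, where the markup may differ from the snippet shown here.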

Upvotes: 1

Dominik Sajovic

Reputation: 841

You are very close.

season_container = page_html.find_all("select", class_="current")[0]  # take the first match
print(season_container.find_all('option', selected=True))

The first line calls find_all, which returns a list-like ResultSet rather than a single element, so you have to index into it to select (presumably) the first match. The rest of the code is fine.
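If only the first matching select is needed, find is a possible alternative to indexing the result of find_all, since it returns a single element or None. The sketch below assumes page_html is a BeautifulSoup object built from markup like that in the question; the values list is an illustrative addition.

```python
from bs4 import BeautifulSoup

# Shortened copy of the <select> markup from the question.
html = """<select id="bySeason" tconst="tt0944947" class="current">
  <option value="1">
    1
  </option>
  <option selected="selected" value="8">
    2
  </option>
</select>
"""

page_html = BeautifulSoup(html, "html.parser")

# find_all returns a ResultSet (a list subclass), which is why calling
# find_all on it directly, as in the question, fails.
matches = page_html.find_all("select", class_="current")
print(type(matches).__name__)

# find returns the first matching Tag, or None if there is no match.
season_container = page_html.find("select", class_="current")

values = []
if season_container is not None:
    # selected=True matches options that have a selected attribute at all.
    for option in season_container.find_all("option", selected=True):
        values.append(option["value"])

print(values)
```

Checking for None avoids an AttributeError when the page has no select element with class "current".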

Upvotes: 1
