Reputation: 13
I'm new to programming and I'm trying to scrape a website.
The website is an online casino (https://www.888casino.it/live-casino/#filters=all-roulette), and I need to scrape just one of the numbers displayed (the number contained in a particular position changes more or less every 30 seconds, but I will think about this later).
<div class="sc-qbELi jLgZIw">
<span>2</span>
</div>
The number I want to scrape is contained within the span tags, which I am unable to locate directly as they have no id or class. As a consequence, I thought about locating the div tag that contains the span tag, and then extracting the number with functions such as .contents, .next_element or .children.
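On a static snippet like the one above, that approach does work — a minimal sketch of .contents / tag navigation on a hard-coded copy of the markup (no live page involved):

```python
from bs4 import BeautifulSoup

# Static copy of the markup shown above, so the parsing step can be
# demonstrated without fetching the live page.
html = '<div class="sc-qbELi jLgZIw"><span>2</span></div>'
soup = BeautifulSoup(html, "html.parser")

div_tag = soup.find("div", class_="sc-qbELi jLgZIw")
number = div_tag.span.text  # equivalently: div_tag.contents[0].text
print(number)  # prints "2"
```

So the parsing logic itself is fine; the problem (as the answer below shows) is what HTML you are parsing.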
In order to locate the div tag (it is not the first div tag in the HTML, and it is nested within many other div tags), I imported the modules and set the link to the webpage:
from bs4 import BeautifulSoup
import requests

url = 'https://www.888casino.it/live-casino/#filters=all-roulette'
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")
I tried the following three solutions:
div_tag = soup.findAll('div', class_='sc-qbELi jLgZIw')
div_tag = soup.find("div", class_="sc-qbELi jLgZIw")
div_tag = soup.select("div.jLgZIw.sc-qbELi")
The problem is that, when printed, these three lines respectively output [], None and []. So when I chain .children or .contents onto div_tag, I don't get anything either.
I would be glad if you could help me figure this out. Thanks for your attention.
Upvotes: 1
Views: 1155
Reputation: 5648
I had to use Selenium. The page content is most likely loaded dynamically with JavaScript, so requests only ever sees the initial HTML, which does not yet contain the numbers.
import time

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
chrome_options.add_argument("--headless")
driver = webdriver.Chrome('chromedriver.exe', options=chrome_options)
url = 'https://www.888casino.it/live-casino/#filters=all-roulette'
driver.get(url)
time.sleep(5)  # give the page time to render its JavaScript content
html = driver.page_source
soup = BeautifulSoup(html, "html.parser")
Using
len(soup.find_all(class_="sc-qbELi jLgZIw"))
gives a length of 50. You'll have to work out which of the 50 matches is the one you want, but this output should get you started.
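Once the rendered HTML is in soup, pulling the numbers out of those matches is straightforward — a sketch on a static snippet standing in for driver.page_source (the repeated class names and span contents here are assumptions modelled on the question's markup):

```python
from bs4 import BeautifulSoup

# Stand-in for driver.page_source; the live page has ~50 such divs.
html = """
<div class="sc-qbELi jLgZIw"><span>2</span></div>
<div class="sc-qbELi jLgZIw"><span>17</span></div>
<div class="sc-qbELi jLgZIw"><span>35</span></div>
"""
soup = BeautifulSoup(html, "html.parser")

# Collect the text of each span inside the matching divs.
numbers = [div.span.text
           for div in soup.find_all("div", class_="sc-qbELi jLgZIw")]
print(numbers)  # prints "['2', '17', '35']"
```

From there you would pick out the one div that corresponds to the table you care about, e.g. by inspecting its surrounding markup in the browser's dev tools.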
Upvotes: 1