Reputation: 11
Hello everyone. I have a fairly specific question today: how do I scrape data from a website whose content is constantly changing, such as an online gambling site? When I run this code I wrote:
    import requests
    from bs4 import BeautifulSoup

    def ColorRequest():
        url = 'http://csgoroll.com/#/'  # Could append str(page) to the URL to request a specific page
        sourcecode = requests.get(url)  # Request the page from the site
        plaintext = sourcecode.text  # Raw HTML returned by the server
        soup = BeautifulSoup(plaintext, 'html.parser')  # Parse the HTML so it can be searched
        for links in soup.find_all():  # With no arguments, find_all() matches every tag
            print(links)

    ColorRequest()
I get the HTML of the page, but I am looking for the elements that are displayed after the page loads, not the markup that makes up the page.
Has any experienced Python developer run into this problem, and could you please help an inexperienced programmer out?
Upvotes: 1
Views: 449
Reputation: 4983
Here's a "direct" way to do this type of scraping.
Normally these "continuously changing" websites are updated via AJAX, so what you should really be looking for is the specific request the page uses to update its content.
You can use Fiddler to capture the traffic while the website is updating, and then find out which request contains the info you need (in this case, probably odds or something similar). Once you have found it, just simulate that request and extract whatever info you need from the response.
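As a rough sketch of "simulate the request", assuming the captured request turns out to be a plain GET that returns JSON. The URL and the `rolls` / `color` field names are made up for illustration; substitute whatever actually shows up in the capture. It uses only the standard library so it runs as-is, but `requests.get(url, headers=...).json()` would do the same job.

```python
import json
import urllib.request

# Hypothetical endpoint: replace with the request URL seen in the traffic capture.
API_URL = "https://example.com/api/rolls"


def fetch_json(url, timeout=10):
    """Replay the captured request and decode the JSON body."""
    request = urllib.request.Request(
        url,
        headers={
            # Some endpoints reject requests that lack browser-like headers.
            "User-Agent": "Mozilla/5.0",
            "Accept": "application/json",
        },
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_colors(payload):
    """Collect the 'color' field of each entry (assumed payload shape)."""
    return [entry["color"] for entry in payload.get("rolls", [])]
```

Calling `extract_colors(fetch_json(API_URL))` in a loop with a short sleep would then poll for updates. If the site pushes updates over a WebSocket instead of AJAX, this approach would need a WebSocket client rather than a plain HTTP request.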
Upvotes: 0
Reputation: 15
There are a number of ways to do this. In the question linked below, Avi gives an example that uses dryscrape with Beautiful Soup.
Web-scraping JavaScript page with Python
I do not have any experience with dryscrape, but you could also do this using Selenium WebDriver with a headless browser such as PhantomJS.
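A minimal sketch of the Selenium route. PhantomJS is no longer maintained and Selenium 4 dropped its driver, so this swaps in headless Chrome instead; it assumes Selenium 4 and Chrome are installed. The function names and the CSS selector you pass in are placeholders, not anything taken from the site.

```python
import time


def make_headless_chrome():
    """Start headless Chrome; assumes Selenium 4 and Chrome are installed."""
    # Imported here so the rest of the sketch loads without Selenium.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    return webdriver.Chrome(options=options)


def rendered_html(driver, url, css_selector, timeout=10.0, poll=0.5):
    """Load url, wait until css_selector matches, return the rendered HTML."""
    driver.get(url)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # "css selector" is the locator strategy string behind By.CSS_SELECTOR.
        if driver.find_elements("css selector", css_selector):
            return driver.page_source
        time.sleep(poll)
    raise TimeoutError(f"{css_selector!r} did not appear within {timeout}s")
```

The returned string is the page after JavaScript has run, so it can be handed to `BeautifulSoup(html, 'html.parser')` exactly as in the question. Remember to call `driver.quit()` when you are done so the browser process does not linger.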
Upvotes: 1