Reputation: 147
I am trying to build a web crawler to gather betting data from multiple betting sites. I have some programming experience, but I am very lost in the world of web pages, web scraping, etc.
I have previously used Selenium to build "bots" and I think I could do something with that. I have also read some tutorials (urllib, Beautiful Soup, etc.), but they all scrape very simple pages, and the ones I want seem to be somewhat different (JavaScript, perhaps?).
For example, this page:
https://sportsbet.io/sports/pre-live/category/kq9kajLnphopJwuwh
How could I get the events with odds, etc.?
Upvotes: 1
Views: 466
Reputation: 1570
I found Web Scraping with Python: Collecting Data from the Modern Web to be a wonderful book. It doesn't assume any experience with web scraping, only that you know the basics of Python.
The author takes you through scenarios ranging from scraping a basic, static HTML page all the way to JavaScript/Ajax-heavy sites that may have some protections against scraping.
In general, the book shows examples using the Requests module for downloading and the BeautifulSoup module for parsing the HTML.
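To give a feel for that download-then-parse pattern, here is a minimal sketch. It is not taken from the book: to keep it runnable without installing anything, it swaps in the standard library's `urllib.request` and `html.parser` for Requests and BeautifulSoup. The names `LinkCollector`, `parse_links` and `fetch` are made up for illustration, and the sample HTML is invented.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> in a document."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently being read, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def parse_links(html):
    """Parsing step: return all links found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


def fetch(url, timeout=10):
    """Download step (hypothetical helper): plain GET, decoded to text."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


if __name__ == "__main__":
    # Invented static HTML, so the example runs without network access.
    sample = '<ul><li><a href="/a">Match A</a></li><li><a href="/b">Match B</a></li></ul>'
    print(parse_links(sample))
```

On a static page you would call `parse_links(fetch(url))`. On a page that builds its content with JavaScript, the downloaded HTML will probably contain little of what you see in the browser, which is where a browser-driving tool such as Selenium comes in.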
It also gives an example of how to make your scripts use Tor to obscure your IP address.
Note that I am in no way affiliated with the seller(s) of the book; it's just that I have found this book immensely useful, and it sounds like you will, too!
Upvotes: 2