Programmer

Reputation: 1294

Scraping a table in Python

Could someone please help me scrape data from the big table on https://www.statsinsider.com.au/prediction-results?fbclid=IwAR18wxeCq_ygxLG1v2JEe3YqBNNS6krzNnOQULYp4IZihQY6JMgHwzpIl6o

Here is what I have so far:

from bs4 import BeautifulSoup
from requests_html import HTMLSession
session = HTMLSession()
url = 'https://www.statsinsider.com.au/prediction-results?fbclid=IwAR18wxeCq_ygxLG1v2JEe3YqBNNS6krzNnOQULYp4IZihQY6JMgHwzpIl6o'
r = session.get(url)
soup=BeautifulSoup(r.html.html,'html.parser')
stat_table = soup.find('table')

This gives the following for stat_table, which doesn't seem to be the entire table: the cells contain {{match...}} placeholders instead of data. Help appreciated, thanks!

<table>
<tbody>
<tr>
<th>Date</th>
<th class="to-hide">Sport</th>
<th>Team</th>
<th class="to-hide">Bet Type</th>
<th>Odds</th>
<th class="to-hide">Bet</th>
<th>Result</th>
<th>Profit/Loss</th>
</tr>
<tr ng-repeat="match in recentResults">
<td>{{match.Date}}</td>
<td class="to-hide">{{match.Sport}}</td>
<td>{{match.Team}}</td>
<td class="to-hide">{{match.Type}}</td>
<td>${{match.Odds}}</td>
<td class="to-hide">${{match.Bet}}</td>
<td>{{match.Result}}</td>
<td class="green" ng-if="match.Return &gt; 0">${{match.Return}}</td>
<td class="red" ng-if="match.Return &lt; 0">${{match.Return}}</td>
<td ng-if="match.Return == 0"></td>
</tr>
</tbody>
</table>

Upvotes: 1

Views: 1482

Answers (2)

Bitto

Reputation: 8215

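Since you are already working with requests, consider letting Requests-HTML render the page. Although its capabilities are not as advanced as Selenium's, it is quite useful in cases like this, where you just want the page rendered. The call that matters is r.html.render(), which executes the page's JavaScript so that the {{match...}} placeholders are replaced with real values before BeautifulSoup parses the HTML.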

To Install

pip install requests-html

The table in the link you provided can be scraped easily using Requests-HTML.

Code:

from bs4 import BeautifulSoup
from requests_html import HTMLSession

session = HTMLSession()
url = 'https://www.statsinsider.com.au/prediction-results?fbclid=IwAR18wxeCq_ygxLG1v2JEe3YqBNNS6krzNnOQULYp4IZihQY6JMgHwzpIl6o'
r = session.get(url)
# Run the page's JavaScript so the ng-repeat rows are filled in.
r.html.render()
soup = BeautifulSoup(r.html.html, 'html.parser')
# find() returns the first <table> on the page, which is the results table here.
stat_table = soup.find('table')
print(stat_table)

Output

<table>
<tbody>
<tr>
<th>Date</th>
<th class="to-hide">Sport</th>
<th>Team</th>
<th class="to-hide">Bet Type</th>
<th>Odds</th>
<th class="to-hide">Bet</th>
<th>Result</th>
<th>Profit/Loss</th>
</tr>

...

<tr class="ng-scope" ng-repeat="match in recentResults">
<td class="ng-binding">17/09</td>
<td class="to-hide ng-binding">NFL</td>
<td class="ng-binding">NO</td>
<td class="to-hide ng-binding">Line</td>
<td class="ng-binding">$1.91</td>
<td class="to-hide ng-binding">$25</td>
<td class="ng-binding">LOSE</td>
<!-- ngIf: match.Return > 0 -->
<!-- ngIf: match.Return < 0 --><td class="red ng-binding ng-scope" ng-if="match.Return &lt; 0">$-25.00</td><!-- end ngIf: match.Return < 0 -->
<!-- ngIf: match.Return == 0 -->
</tr><!-- end ngRepeat: match in recentResults -->
</tbody>
</table>
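
If you want the rows as Python data rather than HTML, a minimal sketch follows. It uses the standard library's html.parser in place of BeautifulSoup purely to stay dependency-free, and the names TableRows and table_to_dicts are my own, not part of any library. It assumes the first row of the table is the header row, as in the output above.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the <tr> currently open
        self._cell = None  # text fragments of the <th>/<td> currently open

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_dicts(table_html):
    """Return one dict per data row, keyed by the header row's cell text."""
    parser = TableRows()
    parser.feed(table_html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]
```

With the code above you would call table_to_dicts(str(stat_table)). The Angular ngIf comments are skipped because the parser treats them as comments, not cell text.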

Upvotes: 2

balderman

Reputation: 23815

This table is populated dynamically by an AJAX call, so the rows are not in the HTML that the server returns.

The page fetches three JSON documents, and one of them holds the data you are looking for.

  1. https://gazza.statsinsider.com.au/results.json?sport=NFL
  2. https://gazza.statsinsider.com.au/sportladder.json?sport=nba
  3. https://gazza.statsinsider.com.au/upcoming.json

All you need to do is send an HTTP GET to each of the URLs above and check which one backs the table (the recentResults model in the template). Once you find the right URL, use requests to fetch the data.
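
A rough sketch of that check is below. I have not verified the shape of these JSON documents, so treat the details as assumptions: the key names in TABLE_KEYS are taken from the {{match...}} placeholders in the question, and the helper names (find_records, fetch_json, locate_table) are made up for illustration. It uses urllib from the standard library instead of requests only to avoid an extra dependency; requests.get(url).json() would do the same job.

```python
import json
from urllib.request import Request, urlopen

# Endpoints listed above.
CANDIDATE_URLS = [
    "https://gazza.statsinsider.com.au/results.json?sport=NFL",
    "https://gazza.statsinsider.com.au/sportladder.json?sport=nba",
    "https://gazza.statsinsider.com.au/upcoming.json",
]

# Assumption: the JSON records use the same keys as the Angular template
# ({{match.Date}}, {{match.Team}}, ...).
TABLE_KEYS = {"Date", "Sport", "Team", "Type", "Odds", "Bet", "Result", "Return"}


def find_records(payload, required_keys=TABLE_KEYS):
    """Return the first non-empty list of dicts that all have required_keys, else None."""
    if isinstance(payload, list):
        if payload and all(
            isinstance(item, dict) and required_keys.issubset(item)
            for item in payload
        ):
            return payload
        children = payload
    elif isinstance(payload, dict):
        children = list(payload.values())
    else:
        return None
    # Not a match at this level, so search nested containers.
    for child in children:
        found = find_records(child, required_keys)
        if found is not None:
            return found
    return None


def fetch_json(url, timeout=10):
    """GET a URL and decode the response body as JSON."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


def locate_table(urls=CANDIDATE_URLS):
    """Return (url, records) for the first endpoint carrying the table rows."""
    for url in urls:
        records = find_records(fetch_json(url))
        if records is not None:
            return url, records
    return None, []
```

Calling locate_table() performs the three requests and returns the matching URL with its records. If it returns (None, []), the key names probably differ from the template, and printing one payload will show the real structure.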

Upvotes: 2
