Reputation: 556
I am a beginner at web scraping with Beautiful Soup, and I am trying to get each EuroLeague team with its wins and losses from https://www.basketball-reference.com/international/euroleague/2020.html
I want to iterate through this table, get each team's name, wins, and losses, and later store them in a list, a CSV file, or a JSON file. With the code below I get only the HTML of the first row in the table, even though I use a for loop:
from bs4 import BeautifulSoup as bs
import requests
from requests import get
import pandas as pd
import json
import time
from time import sleep
url = 'https://www.basketball-reference.com/international/euroleague/2020.html'
time.sleep(2)
source = requests.get(url).text
time.sleep(4)
soup = bs(source,'lxml')
time.sleep(2)
for item in soup.find_all('div', class_='table_outer_container'):
    # prints only first item
    team = item.div.table.tbody.tr
    print(team)
The structure of the table element:
<div class="table_outer_container">
<div class="overthrow table_container" id="div_elg_standings">
<table class="sortable stats_table now_sortable" id="elg_standings" data-cols-to-freeze="1"><caption>EuroLeague Standings Table</caption>
<colgroup><col><col><col></colgroup>
<thead>
<tr class="over_header"><th></th>
<th aria-label="" data-stat="Regular Season" colspan="2" class=" over_header center">Regular Season</th>
</tr>
<tr>
<th aria-label=" " data-stat="team" scope="col" class=" poptip center"> </th>
<th aria-label="Wins" data-stat="wins|Regular Season" scope="col" class=" poptip right" data-tip="Wins" data-over-header="Regular Season">W</th>
<th aria-label="Losses" data-stat="losses|Regular Season" scope="col" class=" poptip right" data-tip="Losses" data-over-header="Regular Season">L</th>
</tr>
</thead>
<tbody>
<tr data-row="0"><th scope="row" class="left " data-stat="team"><a href="/international/teams/anadolu-efes/2020.html">Anadolu Efes</a></th><td class="right " data-stat="wins|Regular Season">24</td><td class="right " data-stat="losses|Regular Season">4</td></tr>
<tr data-row="1"><th scope="row" class="left " data-stat="team"><a href="/international/teams/real-madrid/2020.html">Real Madrid</a></th><td class="right " data-stat="wins|Regular Season">22</td><td class="right " data-stat="losses|Regular Season">6</td></tr>
<tr data-row="2"><th scope="row" class="left " data-stat="team"><a href="/international/teams/barcelona/2020.html">FC Barcelona</a></th><td class="right " data-stat="wins|Regular Season">22</td><td class="right " data-stat="losses|Regular Season">6</td></tr>
<tr data-row="3"><th scope="row" class="left " data-stat="team"><a href="/international/teams/cska-moscow/2020.html">CSKA Moscow</a></th><td class="right " data-stat="wins|Regular Season">19</td><td class="right " data-stat="losses|Regular Season">9</td></tr>
<tr data-row="4"><th scope="row" class="left " data-stat="team"><a href="/international/teams/maccabi-tel-aviv/2020.html">Maccabi FOX Tel Aviv</a></th><td class="right " data-stat="wins|Regular Season">19</td><td class="right " data-stat="losses|Regular Season">9</td></tr>
<tr data-row="5"><th scope="row" class="left " data-stat="team"><a href="/international/teams/panathinaikos/2020.html">Panathinaikos OPAP</a></th><td class="right " data-stat="wins|Regular Season">14</td><td class="right " data-stat="losses|Regular Season">14</td></tr>
<tr data-row="6"><th scope="row" class="left " data-stat="team"><a href="/international/teams/ulker-fenerbahce/2020.html">Fenerbahçe Beko</a></th><td class="right " data-stat="wins|Regular Season">13</td><td class="right " data-stat="losses|Regular Season">15</td></tr>
<tr data-row="7"><th scope="row" class="left " data-stat="team"><a href="/international/teams/khimki/2020.html">Khimki</a></th><td class="right " data-stat="wins|Regular Season">13</td><td class="right " data-stat="losses|Regular Season">15</td></tr>
<tr data-row="8"><th scope="row" class="left " data-stat="team"><a href="/international/teams/vitoria/2020.html">Kirolbet Baskonia</a></th><td class="right " data-stat="wins|Regular Season">12</td><td class="right " data-stat="losses|Regular Season">16</td></tr>
<tr data-row="9"><th scope="row" class="left " data-stat="team"><a href="/international/teams/olympiakos/2020.html">Olympiacos</a></th><td class="right " data-stat="wins|Regular Season">12</td><td class="right " data-stat="losses|Regular Season">16</td></tr>
<tr data-row="10"><th scope="row" class="left " data-stat="team"><a href="/international/teams/zalgiris/2020.html">Žalgiris</a></th><td class="right " data-stat="wins|Regular Season">12</td><td class="right " data-stat="losses|Regular Season">16</td></tr>
<tr data-row="11"><th scope="row" class="left " data-stat="team"><a href="/international/teams/valencia/2020.html">Valencia Basket</a></th><td class="right " data-stat="wins|Regular Season">12</td><td class="right " data-stat="losses|Regular Season">16</td></tr>
<tr data-row="12"><th scope="row" class="left " data-stat="team"><a href="/international/teams/milano/2020.html">AX Armani Exchange Olimpia</a></th><td class="right " data-stat="wins|Regular Season">12</td><td class="right " data-stat="losses|Regular Season">16</td></tr>
<tr data-row="13"><th scope="row" class="left " data-stat="team"><a href="/international/teams/red-star/2020.html">Crvena zvezda mts</a></th><td class="right " data-stat="wins|Regular Season">11</td><td class="right " data-stat="losses|Regular Season">17</td></tr>
<tr data-row="14"><th scope="row" class="left " data-stat="team"><a href="/international/teams/villeurbanne/2020.html">LDLC ASVEL</a></th><td class="right " data-stat="wins|Regular Season">10</td><td class="right " data-stat="losses|Regular Season">18</td></tr>
<tr data-row="15"><th scope="row" class="left " data-stat="team"><a href="/international/teams/alba-berlin/2020.html">Alba Berlin</a></th><td class="right " data-stat="wins|Regular Season">9</td><td class="right " data-stat="losses|Regular Season">19</td></tr>
<tr data-row="16"><th scope="row" class="left " data-stat="team"><a href="/international/teams/triumph-moscow/2020.html">Zenit Saint Petersburg</a></th><td class="right " data-stat="wins|Regular Season">8</td><td class="right " data-stat="losses|Regular Season">20</td></tr>
<tr data-row="17"><th scope="row" class="left " data-stat="team"><a href="/international/teams/bayern-muenchen/2020.html">Bayern Munich</a></th><td class="right " data-stat="wins|Regular Season">8</td><td class="right " data-stat="losses|Regular Season">20</td></tr>
</tbody></table>
</div>
</div>
I would appreciate your help with guiding me to iterate through this element correctly and get the team name, wins, and losses. Thank you in advance.
Upvotes: 1
Views: 58
Reputation: 3123
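Try this. It finds every team link inside the table container, then reads the wins and losses cells from the row that contains each link: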
Code
import requests
from bs4 import BeautifulSoup
url = 'https://www.basketball-reference.com/international/euroleague/2020.html'
soup = BeautifulSoup(requests.get(url).text, 'html.parser')
teams = soup.find('div', class_='table_outer_container')
for team in teams.find_all('a'):
    # each <a> holds a team name; its enclosing <tr> holds that team's stats
    row = team.find_parent('tr')
    team_name = team.text
    wins = row.find('td', {'data-stat': 'wins|Regular Season'}).text
    losses = row.find('td', {'data-stat': 'losses|Regular Season'}).text
    print(team_name, wins, losses)
Output
Anadolu Efes 24 4
Real Madrid 22 6
FC Barcelona 22 6
CSKA Moscow 19 9
Maccabi FOX Tel Aviv 19 9
Panathinaikos OPAP 14 14
Fenerbahçe Beko 13 15
Khimki 13 15
Kirolbet Baskonia 12 16
Olympiacos 12 16
Žalgiris 12 16
Valencia Basket 12 16
AX Armani Exchange Olimpia 12 16
Crvena zvezda mts 11 17
LDLC ASVEL 10 18
Alba Berlin 9 19
Zenit Saint Petersburg 8 20
Bayern Munich 8 20
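Since the question also asks about storing the results in a list, CSV, or JSON, here is a minimal sketch of one way to do that. It iterates over the table rows directly: attribute access such as item.div.table.tbody.tr returns only the first matching tag, while find_all('tr') returns all of them. The SAMPLE_HTML string is a trimmed copy of the markup from the question so the sketch runs without a network request, and the names parse_standings, standings_json, and standings_csv are made up for illustration. For the live page you would pass requests.get(url).text instead, assuming the markup matches what the question shows.

```python
import csv
import io
import json

from bs4 import BeautifulSoup

# Trimmed copy of the table from the question (three rows only),
# used here so the example runs offline.
SAMPLE_HTML = """
<div class="table_outer_container">
<table class="sortable stats_table" id="elg_standings">
<thead><tr><th data-stat="team"> </th><th>W</th><th>L</th></tr></thead>
<tbody>
<tr data-row="0"><th scope="row" data-stat="team"><a href="/international/teams/anadolu-efes/2020.html">Anadolu Efes</a></th><td data-stat="wins|Regular Season">24</td><td data-stat="losses|Regular Season">4</td></tr>
<tr data-row="1"><th scope="row" data-stat="team"><a href="/international/teams/real-madrid/2020.html">Real Madrid</a></th><td data-stat="wins|Regular Season">22</td><td data-stat="losses|Regular Season">6</td></tr>
<tr data-row="10"><th scope="row" data-stat="team"><a href="/international/teams/zalgiris/2020.html">Žalgiris</a></th><td data-stat="wins|Regular Season">12</td><td data-stat="losses|Regular Season">16</td></tr>
</tbody>
</table>
</div>
"""


def parse_standings(html):
    """Return a list of {'team', 'wins', 'losses'} dicts, one per table row."""
    soup = BeautifulSoup(html, 'html.parser')
    table = soup.find('table', id='elg_standings')
    rows = []
    # find_all returns every <tr> in the body, not just the first one.
    for tr in table.tbody.find_all('tr'):
        team = tr.find('th', {'data-stat': 'team'})
        wins = tr.find('td', {'data-stat': 'wins|Regular Season'})
        losses = tr.find('td', {'data-stat': 'losses|Regular Season'})
        if not (team and wins and losses):
            continue  # skip rows that lack any of the three cells
        rows.append({
            'team': team.get_text(strip=True),
            'wins': int(wins.get_text(strip=True)),
            'losses': int(losses.get_text(strip=True)),
        })
    return rows


standings = parse_standings(SAMPLE_HTML)

# JSON text; ensure_ascii=False keeps names such as "Žalgiris" readable.
standings_json = json.dumps(standings, ensure_ascii=False, indent=2)

# CSV text, written to an in-memory buffer; to write a real file, use
# open('standings.csv', 'w', newline='', encoding='utf-8') instead.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=['team', 'wins', 'losses'])
writer.writeheader()
writer.writerows(standings)
standings_csv = buffer.getvalue()

print(standings_csv)
```

Keeping each row as a dict means the same list feeds json.dumps, csv.DictWriter, or pd.DataFrame(standings) without any reshaping.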
Upvotes: 1