Warren Crasta

Reputation: 488

BeautifulSoup - Scraping Multiple Tables from a page?

I'm trying to scrape the content from this URL which contains multiple tables. The desired output would be:

NAME        FG% FT% 3PM REB AST STL BLK TO  PTS     SCORE
Team Jackson (0-8)      .4313   .7500   21  71  34  11  12  15  189     1-8-0
Team Keyrouze (4-4)     .4441   .8090   31  130 71  18  13  45  373     8-1-0
Nutz Vs. Draymond Green (4-4)       .4292   .8769   30  86  66  15  9   28  269     3-6-0
Team Pauls 2 da Wall (3-5)      .4784   .8438   40  123 64  18  20  30  316     6-3-0
Team Noey (2-6)     .4350   .7679   21  125 62  20  9   33  278     7-2-0
YOU REACH, I TEACH (2-5-1)      .4810   .7432   20  114 56  30  7   50  277     2-7-0
Kris Kaman His Pants (5-3)      .4328   .8000   20  74  59  20  5   27  238     3-6-0
Duke's Balls In Daniels Face (3-4-1)        .5000   .7045   42  139 38  27  22  30  303     6-3-0
Knicks Tape (5-3)       .5000   .8152   34  143 92  12  9   47  397     4-5-0
Suck MyDirk (5-3)       .4734   .8814   29  106 86  22  17  40  435     5-4-0
In Porzingod We Trust (4-4)     .4928   .7222   27  180 95  16  16  46  423     7-2-0
Team Aguilar (6-1-1)        .4718   .7053   28  177 65  12  35  48  413     2-7-0
Team Li (7-0-1)     .4714   .8118   35  134 74  17  17  47  368     6-3-0
Team Iannetta (4-4)     .4527   .7302   22  125 90  20  13  44  288     3-6-0

If formatting the tables like that is too difficult, I'd at least like to know how to scrape all of them. My code to scrape all the rows looks like this:

tableStats = soup.find('table', {'class': 'tableBody'})
rows = tableStats.findAll('tr')

for row in rows:
    print(row.string)

But it only prints the value "TEAM" and nothing else... Why doesn't it contain all the rows in the table?

Thanks.

Upvotes: 1

Views: 4124

Answers (2)

Warren Crasta

Reputation: 488

I found a way to get exactly the 2-D matrix I described in the question. It's stored in the list teams.

Code:

from bs4 import BeautifulSoup
import requests

source_code = requests.get("http://games.espn.com/fba/scoreboard?leagueId=224165&seasonId=2017")
plain_text = source_code.text
soup = BeautifulSoup(plain_text, 'lxml')
teams = []
rows = soup.findAll('tr', {'class': 'linescoreTeamRow'})

# Build a 2-D matrix: one list of cell texts per team row.
for row in rows:
    team_row = [column.getText() for column in row.findAll('td')]
    print(team_row)
    # Add each team to the teams matrix.
    teams.append(team_row)

Output:

['Team Jackson (0-10)', '', '.4510', '.8375', '41', '135', '101', '23', '11', '50', '384', '', '5-4-0']
['YOU REACH, I TEACH (3-6-1)', '', '.4684', '.7907', '22', '169', '103', '22', '10', '32', '342', '', '4-5-0']
['Nutz Vs. Draymond Green (4-6)', '', '.4552', '.8372', '30', '157', '68', '15', '16', '39', '356', '', '2-7-0']
["Jesse's  Blue Balls (4-5-1)", '', '.4609', '.7576', '47', '158', '71', '30', '20', '38', '333', '', '7-2-0']
['Team Noey (4-6)', '', '.4763', '.8261', '42', '164', '70', '25', '29', '44', '480', '', '5-4-0']
['Suck MyDirk (6-3-1)', '', '.4733', '.8403', '54', '160', '132', '23', '11', '47', '544', '', '4-5-0']
['Kris Kaman  His Pants (5-5)', '', '.4569', '.8732', '53', '138', '105', '27', '21', '53', '465', '', '6-3-0']
['Team Aguilar (6-3-1)', '', '.4433', '.7229', '40', '202', '68', '30', '22', '54', '452', '', '3-6-0']
['Knicks Tape (6-3-1)', '', '.4406', '.8824', '52', '172', '108', '24', '13', '49', '513', '', '6-3-0']
['Team Iannetta (4-6)', '', '.5321', '.6923', '24', '146', '94', '32', '16', '60', '428', '', '3-6-0']
['In Porzingod We Trust (6-4)', '', '.4694', '.6364', '37', '216', '133', '31', '21', '77', '468', '', '4-5-0']
['Team Keyrouze (6-4)', '', '.4705', '.8854', '51', '135', '108', '25', '17', '43', '550', '', '5-4-0']
['Team Li (8-1-1)', '', '.4369', '.8182', '57', '203', '130', '34', '22', '54', '525', '', '6-3-0']
['Team Pauls 2 da Wall (5-5)', '', '.4780', '.5970', '27', '141', '47', '19', '25', '28', '263', '', '3-6-0']
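
To get from these lists to the tab-separated layout in the question, one option is to drop the empty cells and join the rest. This is only a minimal sketch, assuming every row has the 13-cell shape shown in the output above and that the empty strings are just spacer cells. The names HEADER and format_rows are my own, not anything from the page:

```python
# Column names taken from the desired output in the question.
HEADER = ["NAME", "FG%", "FT%", "3PM", "REB", "AST", "STL", "BLK", "TO", "PTS", "SCORE"]


def format_rows(teams):
    """Return tab-separated lines: the header, then one line per team.

    Assumes empty strings are spacer cells and can be dropped.
    """
    lines = ["\t".join(HEADER)]
    for team_row in teams:
        cells = [cell.strip() for cell in team_row if cell.strip()]
        lines.append("\t".join(cells))
    return lines


# Two rows copied from the output above, standing in for the scraped list.
sample = [
    ['Team Jackson (0-10)', '', '.4510', '.8375', '41', '135', '101', '23', '11', '50', '384', '', '5-4-0'],
    ['Team Li (8-1-1)', '', '.4369', '.8182', '57', '203', '130', '34', '22', '54', '525', '', '6-3-0'],
]

for line in format_rows(sample):
    print(line)
```

With the real data you would pass teams instead of sample. If a stat cell could ever be legitimately empty, dropping blanks would shift the columns, so picking cells by index might be safer in that case.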

Upvotes: 0

martianwars

Reputation: 6500

Instead of looking for the table tag, you should look for the rows directly with a more dependable class, such as linescoreTeamRow. This code snippet does the trick:

from bs4 import BeautifulSoup
import requests
a = requests.get("http://games.espn.com/fba/scoreboard?leagueId=224165&seasonId=2017")
soup = BeautifulSoup(a.text, 'lxml')
# searching for the rows directly
rows = soup.findAll('tr', {'class': 'linescoreTeamRow'})
# you will need to isolate elements in the row for the table
for row in rows:
    print(row.text)
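
On why the loop in the question printed so little: as far as I know, .string only returns text when a tag has exactly one child, and gives None otherwise, so a row with several cells yields None while .text concatenates everything. Here is a small sketch on made-up HTML (the markup is an assumption for illustration, not ESPN's actual page) that also shows one way to isolate the cells of a row:

```python
from bs4 import BeautifulSoup

# Hypothetical markup, loosely modelled on the class name used above.
html = """
<table>
  <tr><td>TEAM</td></tr>
  <tr class="linescoreTeamRow"><td>Team Li (8-1-1)</td><td>.4369</td><td>.8182</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
header_row, team_row = soup.find_all("tr")

# One child: .string resolves to that child's text.
print(header_row.string)
# Several children: .string is None.
print(team_row.string)

# Isolate each cell instead of relying on .string or the row's joined text.
cells = [td.get_text(strip=True) for td in team_row.find_all("td")]
print(cells)
```

The list comprehension at the end is the part that carries over to the real page: run it once per row returned by the linescoreTeamRow search.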

Upvotes: 3
