Roberto Corral

Reputation: 41

Getting table from webpage: Problem getting full html

I need to get the table from this page: https://stats.nba.com/teams/traditional/?sort=GP&dir=-1. Inspecting the page's HTML shows that the table is nested inside the descendants of this tag:

<nba-stat-table filters="filters" ... >
 <div class="nba-stat-table">
  <div class="nba-stat-table__overflow" data-fixed="2" role="grid">
   <table>
    ...
</nba-stat-table>

(I cannot add a screenshot since I am new to Stack Overflow, but if you right-click anywhere in the table and choose "Inspect Element" you will see what I mean.)

I've tried several approaches, such as the first and second answers to the question "How to extract tables from websites in Python", as well as the answers to the question "pandas read_html ValueError: No tables found" (the first approach gave me an error that is essentially the subject of that second question).

First try using pandas:

import requests
import pandas as pd

url = 'http://stats.nba.com/teams/traditional/?sort=GP&dir=-1'
html = requests.get(url).content
df_list = pd.read_html(html)
df = df_list[-1]

Or another try with BeautifulSoup:

import requests
from bs4 import BeautifulSoup

url = "https://stats.nba.com/teams/traditional/?sort=GP&dir=-1"
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')
stats_table = soup.find('nba-stat-table')
for child in stats_table.descendants:
    print(child)

For the first I got the error "ValueError: No tables found". For the second I didn't get any error, but nothing was printed. I then wrote the response to a file to see what was actually being returned:

with open('html.txt', 'w') as fout:
    fout.write(str(page.content))

and/or:

with open('html.txt', 'w') as fout:
    fout.write(str(soup))

and in the text file, in the part of the HTML where the table should be, I get:

<nba-stat-table filters="filters" 
                ng-if="!isLoading &amp;&amp; !noData" 
                options="options" 
                params="params" 
                rows="teamStats.rows" 
                template="teams/teams-traditional">
</nba-stat-table>

So it appears that I am not getting the descendants of this tag, which are what actually contain the table data. Does anyone have a solution that retrieves the whole HTML of the page so that I can parse it, or an alternative way of obtaining the table?

Upvotes: 0

Views: 360

Answers (2)

chitown88

Reputation: 28565

Here's what I try when attempting to scrape data. (By the way I LOVE scraping/working with sports data.)

1) Pandas pd.read_html() (a parser such as lxml or BeautifulSoup does the work under the hood). I like this method because it's easy and quick, and it usually needs only a small amount of manipulation when it does return what I want. However, pd.read_html() only works if the data sits inside <table> tags in the HTML. Since the HTML returned by requests contains no <table> tags here (the table is filled in later by JavaScript), it raises the "ValueError: No tables found" that you saw. So good work on trying that first; it's the easiest method when it works.

2) My other "go to" method is to check whether the data is pulled in through XHR. Actually, this might be my first choice, since it can let you filter what is returned, but it requires a little more (not much) investigative work to find the correct request URL and query parameters. (This is the route I took for this solution.)

3) If the page is generated through JavaScript, you can sometimes find the data in JSON format inside <script> tags using BeautifulSoup. This requires a bit more investigation to pull out the right <script> tag, followed by some string manipulation to get the text into valid JSON so that json.loads() can read it.

4a) Use BeautifulSoup to pull out the data elements if they are present in other tags and not rendered by javascript.

4b) Selenium is an option that lets the page render first, after which you can parse the HTML with BeautifulSoup (or, if the rendered page contains <table> tags, with pd.read_html()), but it is usually my last choice. It's not that it doesn't work or is bad; it's just slow and unnecessary if any of the choices above work.
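To illustrate option 3, here is a minimal sketch of pulling a JSON object out of a <script> tag. It is not taken from the NBA page: the sample HTML, the variable name `window.__DATA__` and the helper `extract_embedded_json` are all made up for illustration, and it uses only the standard library (`re`, `json`) so that it runs without extra installs. With BeautifulSoup you would locate the <script> tag with the parser instead of a regular expression.

```python
import json
import re

# Hypothetical page source: the variable name `window.__DATA__` and the
# payload are invented for this example; real pages will differ.
html = """
<html><body>
<script>var unrelated = 1;</script>
<script>window.__DATA__ = {"teams": [{"name": "Atlanta Hawks", "gp": 4}]};</script>
</body></html>
"""


def extract_embedded_json(page_source, marker):
    """Return the JSON value assigned to `marker` inside a <script> tag, or None."""
    scripts = re.findall(r"<script[^>]*>(.*?)</script>", page_source, flags=re.DOTALL)
    for script_body in scripts:
        if marker not in script_body:
            continue
        # Keep only the text after "marker =" so it starts at the JSON value.
        _, _, tail = script_body.partition(marker)
        tail = tail.lstrip().lstrip("=").lstrip()
        # raw_decode parses one JSON value and ignores what follows (here the ';').
        data, _end = json.JSONDecoder().raw_decode(tail)
        return data
    return None


data = extract_embedded_json(html, "window.__DATA__")
print(data["teams"][0]["name"])
```

The "string manipulation" step is the part that varies most from site to site; `raw_decode` is used here so that trailing JavaScript after the object does not have to be stripped by hand.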

So I went with option 2. Here's the code and output:

import requests
import pandas as pd

url = 'https://stats.nba.com/stats/leaguedashteamstats'

headers = {'User-Agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.120 Mobile Safari/537.36'}

payload = {
    'Conference': '',
    'DateFrom': '',
    'DateTo': '',
    'Division': '',
    'GameScope': '',
    'GameSegment': '',
    'LastNGames': '82',
    'LeagueID': '00',
    'Location': '',
    'MeasureType': 'Base',
    'Month': '0',
    'OpponentTeamID': '0',
    'Outcome': '',
    'PORound': '0',
    'PaceAdjust': 'N',
    'PerMode': 'PerGame',
    'Period': '0',
    'PlayerExperience': '',
    'PlayerPosition': '',
    'PlusMinus': 'N',
    'Rank': 'N',
    'Season': '2019-20',
    'SeasonSegment': '',
    'SeasonType': 'Regular Season',
    'ShotClockRange': '',
    'StarterBench': '',
    'TeamID': '0',
    'TwoWay': '0',
    'VsConference': '',
    'VsDivision': '',
}

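# Call the JSON endpoint directly instead of the rendered page.
jsonData = requests.get(url, headers=headers, params=payload).json()

# The first result set holds the column names ('headers') and the data rows ('rowSet').
result = jsonData['resultSets'][0]
df = pd.DataFrame(result['rowSet'], columns=result['headers'])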

Output:

print(df.to_string())
       TEAM_ID               TEAM_NAME  GP  W  L  W_PCT   MIN   FGM    FGA  FG_PCT  FG3M  FG3A  FG3_PCT   FTM   FTA  FT_PCT  OREB  DREB   REB   AST   TOV   STL  BLK  BLKA    PF   PFD    PTS  PLUS_MINUS  GP_RANK  W_RANK  L_RANK  W_PCT_RANK  MIN_RANK  FGM_RANK  FGA_RANK  FG_PCT_RANK  FG3M_RANK  FG3A_RANK  FG3_PCT_RANK  FTM_RANK  FTA_RANK  FT_PCT_RANK  OREB_RANK  DREB_RANK  REB_RANK  AST_RANK  TOV_RANK  STL_RANK  BLK_RANK  BLKA_RANK  PF_RANK  PFD_RANK  PTS_RANK  PLUS_MINUS_RANK  CFID                CFPARAMS
0   1610612737           Atlanta Hawks   4  2  2  0.500  48.0  39.5   84.3   0.469  10.0  31.8    0.315  16.0  23.0   0.696   8.5  34.5  43.0  25.0  18.0  10.0  5.3   7.3  23.8  21.5  105.0         1.0        1      11      14          14        10        14        27            5         23         21            21        24        21           27         24         21        25         9        19         3        15         29       17        25        22               15    10           Atlanta Hawks
1   1610612738          Boston Celtics   3  2  1  0.667  48.0  39.0   97.3   0.401  11.7  35.0    0.333  18.0  26.3   0.684  14.7  33.0  47.7  21.0  11.3   9.3  6.3   5.7  25.0  29.3  107.7         5.0       19      11       4          11        10        18         3           28         13         12            18        19        12           28          2         25        13        22         1         7         7         19       20         1        16               10    10          Boston Celtics
2   1610612751           Brooklyn Nets   3  1  2  0.333  51.3  43.0   93.3   0.461  15.3  38.7    0.397  22.7  32.3   0.701  10.7  38.3  49.0  22.7  19.7   8.3  5.3   6.0  26.0  27.3  124.0         0.7       19      18      14          18         1         4         9            8          3          7             4         7         3           24         11          9         7        19        26        15        13         22       21         3         1               16    10           Brooklyn Nets
3   1610612766       Charlotte Hornets   4  1  3  0.250  48.0  38.3   86.5   0.442  14.8  36.8    0.401  14.3  19.8   0.722  10.0  31.5  41.5  24.3  19.3   5.3  4.0   6.5  22.5  21.8  105.5       -13.8        1      18      23          23        10        23        23           16          4          9             3        26        26           23         14         28        29        14        24        30        24         25       10        22        21               28    10       Charlotte Hornets
4   1610612741           Chicago Bulls   4  1  3  0.250  48.0  38.5   95.0   0.405   9.8  35.5    0.275  17.5  23.0   0.761  12.3  32.0  44.3  20.0  12.8  10.0  4.5   7.0  21.3  20.8  104.3        -6.0        1      18      23          23        10        20         6           27         24         11            29        20        21           15          7         26        24        26         2         3        20         27        7        26        24               23    10           Chicago Bulls
5   1610612739     Cleveland Cavaliers   3  1  2  0.333  48.0  39.3   89.3   0.440  10.7  34.7    0.308  13.0  18.7   0.696  10.7  38.0  48.7  20.7  15.7   6.3  4.0   4.7  19.0  19.3  102.3        -5.0       19      18      14          18        10        17        16           19         19         13            24        29        27           26         11         10        10        23        13        25        24         11        3        30        26               22    10     Cleveland Cavaliers
6   1610612742        Dallas Mavericks   4  3  1  0.750  48.0  39.5   86.8   0.455  12.8  40.8    0.313  23.0  31.0   0.742   9.8  36.0  45.8  24.0  13.0   6.8  5.0   2.8  19.3  27.0  114.8         4.0        1       1       4           4        10        14        21           10          8          5            22         5         4           19         17         15        21        15         3        19        19          1        4         7        10               12    10        Dallas Mavericks
7   1610612743          Denver Nuggets   4  3  1  0.750  49.3  37.3   90.5   0.412  11.5  31.8    0.362  19.8  24.3   0.814  13.0  35.5  48.5  22.0  14.3   8.0  5.5   4.5  22.8  23.8  105.8         3.3        1       1       4           4         4        27        13           25         14         21            11        13        20            7          4         19        11        20         7        16        11          9       13        14        20               13    10          Denver Nuggets
8   1610612765         Detroit Pistons   4  2  2  0.500  48.0  38.5   80.0   0.481  10.5  26.0    0.404  19.0  25.3   0.752   8.3  33.5  41.8  21.8  18.8   6.0  5.3   3.8  21.8  21.8  106.5        -3.0        1      11      14          14        10        20        29            3         20         28             2        15        15           17         26         24        28        21        21        27        15          4        9        22        18               21    10         Detroit Pistons
9   1610612744   Golden State Warriors   3  1  2  0.333  48.0  40.0   98.3   0.407  11.3  36.7    0.309  24.7  28.3   0.871  15.3  32.0  47.3  27.0  15.3   9.3  1.3   5.7  19.3  23.3  116.0       -12.0       19      18      14          18        10        10         2           26         15         10            23         3         8            2          1         26        14         4        10         7        30         19        5        17         9               27    10   Golden State Warriors
10  1610612745         Houston Rockets   3  2  1  0.667  48.0  38.3   91.3   0.420  13.0  45.7    0.285  28.0  34.0   0.824   9.3  38.0  47.3  24.3  15.7   6.3  5.3   5.0  23.7  28.0  117.7         0.3       19      11       4          11        10        22        11           23          6          3            27         1         1            5         21         10        14        12        13        25        13         15       15         2         8               18    10         Houston Rockets
11  1610612754          Indiana Pacers   3  0  3  0.000  48.0  39.7   90.0   0.441   8.0  23.3    0.343  13.7  16.7   0.820   9.7  29.3  39.0  24.3  13.3   8.7  4.3   5.3  23.7  19.7  101.0        -7.3       19      28      23          28        10        13        14           18         29         30            14        27        29            6         19         30        30        12         5        10        23         18       15        28        27               26    10          Indiana Pacers
12  1610612746             LA Clippers   4  3  1  0.750  48.0  43.0   82.8   0.520  13.0  32.0    0.406  22.5  28.5   0.789   8.3  34.0  42.3  25.0  17.0   8.5  5.5   3.3  26.3  25.5  121.5         9.0        1       1       4           4        10         4        28            1          6         19             1         8         7           11         26         22        26         9        18        11        11          2       23        10         3                3    10             LA Clippers
13  1610612747      Los Angeles Lakers   4  3  1  0.750  48.0  40.0   87.5   0.457   9.8  29.0    0.336  19.5  24.5   0.796  10.0  36.0  46.0  23.5  15.3   8.5  8.0   3.5  21.5  24.3  109.3        11.8        1       1       4           4        10        10        17            9         24         25            17        14        17            8         14         15        19        17         9        11         1          3        8        12        15                1    10      Los Angeles Lakers
14  1610612763       Memphis Grizzlies   4  1  3  0.250  49.3  39.5   95.3   0.415   9.0  32.0    0.281  19.0  24.5   0.776  11.3  36.5  47.8  24.8  18.8   9.0  6.5   7.0  27.0  23.8  107.0       -13.8        1      18      23          23         4        14         5           24         27         19            28        15        17           14         10         14        12        11        21         9         5         27       26        14        17               28    10       Memphis Grizzlies
15  1610612748              Miami Heat   4  3  1  0.750  49.3  40.3   86.0   0.468  12.8  32.3    0.395  24.8  33.8   0.733   9.8  39.0  48.8  23.8  22.5   8.5  6.5   4.8  27.0  27.3  118.0         8.0        1       1       4           4         4         9        25            6          8         17             5         2         2           20         17          6         9        16        30        11         5         12       26         6         7                6    10              Miami Heat
16  1610612749         Milwaukee Bucks   3  2  1  0.667  49.7  45.0   95.0   0.474  16.7  46.0    0.362  17.3  25.7   0.675   6.3  43.7  50.0  27.3  13.7   8.0  7.0   4.0  24.7  25.7  124.0         6.0       19      11       4          11         2         2         6            4          2          1            10        21        14           29         29          2         3         3         6        16         2          5       19         9         1                9    10         Milwaukee Bucks
17  1610612750  Minnesota Timberwolves   3  3  0  1.000  49.7  42.7   96.7   0.441  12.7  42.0    0.302  23.3  30.7   0.761  13.0  37.0  50.0  25.7  15.3  10.7  3.7   7.7  20.0  27.3  121.3        10.0       19       1       1           1         2         6         4           17         10          4            25         4         5           15          4         13         3         5        10         1        28         30        6         3         4                2    10  Minnesota Timberwolves
18  1610612740    New Orleans Pelicans   4  0  4  0.000  49.3  45.5  100.8   0.452  16.8  45.8    0.366  13.3  18.3   0.726  12.0  34.0  46.0  30.8  16.3   8.0  5.3   4.0  26.5  21.8  121.0        -7.3        1      28      29          28         4         1         1           13          1          2             8        28        28           21          8         22        19         1        17        16        15          5       25        22         5               24    10    New Orleans Pelicans
19  1610612752         New York Knicks   4  1  3  0.250  48.0  37.8   87.0   0.434  10.8  27.8    0.387  18.8  28.0   0.670  13.8  35.3  49.0  18.8  20.3  10.0  3.8   5.3  27.0  23.0  105.0        -7.3        1      18      23          23        10        24        19           20         17         27             6        17        10           30          3         20         7        27        27         3        27         17       26        18        22               24    10         New York Knicks
20  1610612760   Oklahoma City Thunder   4  1  3  0.250  48.0  37.5   84.5   0.444  10.8  29.3    0.368  17.3  24.8   0.697   9.5  40.3  49.8  18.8  18.5   6.8  4.5   4.8  23.5  22.8  103.0         1.8        1      18      23          23        10        25        26           15         17         23             7        22        16           25         20          3         5        27        20        19        20         12       14        19        25               14    10   Oklahoma City Thunder
21  1610612753           Orlando Magic   3  1  2  0.333  48.0  35.3   91.3   0.387   8.7  33.3    0.260  16.7  21.0   0.794  10.7  35.7  46.3  20.3  13.0   9.7  5.7   4.3  17.7  20.3   96.0        -1.3       19      18      14          18        10        28        11           30         28         16            30        23        25            9         11         18        17        24         3         6         9          8        1        27        29               20    10           Orlando Magic
22  1610612755      Philadelphia 76ers   3  3  0  1.000  48.0  38.7   86.7   0.446  10.3  34.7    0.298  22.0  30.3   0.725  10.0  39.7  49.7  25.3  20.3  10.7  7.0   4.0  29.7  27.3  109.7         7.3       19       1       1           1        10        19        22           14         22         13            26        11         6           22         14          5         6         7        29         1         2          5       29         3        14                7    10      Philadelphia 76ers
23  1610612756            Phoenix Suns   4  2  2  0.500  49.3  39.8   87.5   0.454  12.3  34.5    0.355  22.3  26.8   0.832   7.8  39.0  46.8  27.8  16.0   8.5  4.0   6.5  31.3  27.0  114.0         8.8        1      11      14          14         4        12        17           11         12         15            13        10        11            4         28          6        16         2        15        11        24         25       30         7        11                4    10            Phoenix Suns
24  1610612757  Portland Trail Blazers   4  2  2  0.500  48.0  41.5   89.8   0.462   9.3  28.3    0.327  21.0  24.5   0.857   8.5  37.8  46.3  17.0  15.5   6.8  5.3   4.5  26.3  22.5  113.3         0.3        1      11      14          14        10         7        15            7         26         26            20        12        17            3         24         12        18        30        12        19        15          9       23        20        12               19    10  Portland Trail Blazers
25  1610612758        Sacramento Kings   4  0  4  0.000  48.0  34.3   86.5   0.396  11.0  32.3    0.341  16.0  21.5   0.744  11.5  30.8  42.3  18.8  18.8   6.5  4.5   5.0  22.5  22.0   95.5       -19.5        1      28      29          28        10        30        23           29         16         17            15        24        24           18          9         29        26        27        21        23        20         15       10        21        30               30    10        Sacramento Kings
26  1610612759       San Antonio Spurs   3  3  0  1.000  48.0  44.3   92.0   0.482   8.0  23.7    0.338  22.3  28.3   0.788  12.7  38.7  51.3  25.3  16.0   5.7  7.0   5.7  18.7  24.7  119.0         4.7       19       1       1           1        10         3        10            2         29         29            16         9         8           12          6          8         2         7        15        29         2         19        2        11         6               11    10       San Antonio Spurs
27  1610612761         Toronto Raptors   4  3  1  0.750  49.3  37.5   87.0   0.431  14.3  39.3    0.363  22.8  25.8   0.883   9.3  44.3  53.5  22.8  20.3   6.8  5.8   6.3  24.3  24.0  112.0         8.8        1       1       4           4         4        25        19           22          5          6             9         6        13            1         22          1         1        18        27        19         8         24       18        13        13                4    10         Toronto Raptors
28  1610612762               Utah Jazz   4  3  1  0.750  48.0  35.0   77.3   0.453  10.5  29.3    0.359  18.3  23.0   0.793   5.5  39.8  45.3  20.3  19.5   6.5  3.3   4.8  26.0  23.5   98.8         7.3        1       1       4           4        10        29        30           12         20         23            12        18        21           10         30          4        22        25        25        23        29         12       21        16        28                8    10               Utah Jazz
29  1610612764      Washington Wizards   3  1  2  0.333  48.0  41.0   95.0   0.432  12.7  38.7    0.328  11.7  15.0   0.778   9.0  36.0  45.0  25.7  15.0   6.0  5.7   6.0  22.7  19.7  106.3         0.7       19      18      14          18        10         8         6           21         10          7            19        30        30           13         23         15        23         5         8        27         9         22       12        28        19               16    10      Washington Wizards
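The response shape that the code indexes (`resultSets`, then `headers` and `rowSet`) can be tried out offline. The sketch below uses a hand-trimmed sample with two teams and four columns whose values are copied from the output above; the helper name `result_set_to_records` and the choice to drop the `*_RANK` columns are illustrative assumptions, not part of the endpoint.

```python
# Hand-trimmed sample in the same shape as the real response:
# jsonData['resultSets'][0]['headers'] and jsonData['resultSets'][0]['rowSet'].
json_data = {
    "resultSets": [
        {
            "headers": ["TEAM_ID", "TEAM_NAME", "GP", "GP_RANK"],
            "rowSet": [
                [1610612737, "Atlanta Hawks", 4, 1],
                [1610612738, "Boston Celtics", 3, 19],
            ],
        }
    ]
}


def result_set_to_records(payload, index=0, drop_suffix="_RANK"):
    """Pair each row with the headers, skipping columns that end in `drop_suffix`."""
    result = payload["resultSets"][index]
    headers = result["headers"]
    records = []
    for row in result["rowSet"]:
        record = {
            name: value
            for name, value in zip(headers, row)
            if not name.endswith(drop_suffix)
        }
        records.append(record)
    return records


records = result_set_to_records(json_data)
for record in records:
    print(record)
```

A list of dictionaries like this can be handed to pd.DataFrame(records) if you prefer to work in pandas; on the full DataFrame, the same trimming could be done by filtering df.columns.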

Upvotes: 1

fumiya.f

Reputation: 324

Using Selenium would be the best way to do this, since it gives you the whole content after it has been rendered by JavaScript.

https://towardsdatascience.com/simple-web-scraping-with-pythons-selenium-4cedc52798cd
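A minimal sketch of that approach, assuming Selenium 4 with Chrome and a matching driver available on the machine. The function name `fetch_rendered_html`, the 20-second timeout and the headless flag are illustrative choices; the CSS selector is based on the markup shown in the question and may need adjusting if the page changes.

```python
def fetch_rendered_html(url, css_selector="nba-stat-table table", timeout=20):
    """Load `url` in headless Chrome and return the HTML once `css_selector` exists."""
    # Imported lazily so the function can be defined without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Wait until JavaScript has inserted the table; raises TimeoutException otherwise.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, css_selector))
        )
        return driver.page_source
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()
```

Calling fetch_rendered_html('https://stats.nba.com/teams/traditional/?sort=GP&dir=-1') should return the rendered page as a string, which can then be parsed with BeautifulSoup or passed to pd.read_html() (wrapped in io.StringIO on recent pandas versions).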

Upvotes: 0
