Reputation: 13
I'm trying to scrape the tables of past NBA team seasons from this website, and there is one specific table I need. The only way I know to scrape tables is the pandas read_html function, so that is what I've been using. When I checked the length of the result, pandas reported only 5 tables, although the page actually has 14. The table I want is one of the ones pandas does not find. The code I used was as follows:
import pandas as pd
url = "https://www.basketball-reference.com/teams/BOS/1980.html"
tables = pd.read_html(url)
When I run it and look through the result, I only get 5 tables. Can anyone help?
Upvotes: 0
Views: 42
Reputation: 31226
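pd.read_html() only sees tables that are part of the live HTML. On this page most of the tables are wrapped in HTML comments, so the table has to be pulled out of its comment before it can be parsed: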
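import requests
from io import StringIO
from bs4 import BeautifulSoup, Comment
import pandas as pd

res = requests.get("https://www.basketball-reference.com/teams/BOS/1980.html")
table_id = "div_team_and_opponent"  # id of the div that wraps the wanted table
html = BeautifulSoup(res.content, "html.parser")
# Find the HTML comment whose text mentions the wanted table's container id
comment = html.find_all(string=lambda text: isinstance(text, Comment) and table_id in text)[0]
# Parse the comment body as HTML; StringIO avoids the literal-string deprecation in recent pandas
pd.read_html(StringIO(str(comment)))[0]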
| | Unnamed: 0 | G | MP | FG | FGA | FG% | 3P | 3PA | 3P% | 2P | 2PA | 2P% | FT | FTA | FT% | ORB | DRB | TRB | AST | STL | BLK | TOV | PF | PTS |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Team | 82 | 19880 | 3617 | 7387 | 0.49 | 162 | 422 | 0.384 | 3455 | 6965 | 0.496 | 1907 | 2449 | 0.779 | 1227 | 2457 | 3684 | 2198 | 809 | 308 | 1539 | 1974 | 9303 |
1 | Team/G | nan | 242.4 | 44.1 | 90.1 | 0.49 | 2 | 5.1 | 0.384 | 42.1 | 84.9 | 0.496 | 23.3 | 29.9 | 0.779 | 15.0 | 30.0 | 44.9 | 26.8 | 9.9 | 3.8 | 18.8 | 24.1 | 113.5 |
2 | Lg Rank | nan | 4 | 8 | 14 | 7 | 2 | 2 | 1 | 15 | 17 | 7 | 4 | 6 | 5 | 13 | 10 | 11 | 8 | 6 | 21 | 11 | 13 | 5 |
3 | Year/Year | nan | 1.0% | 2.6% | 0.5% | 0.009 | nan | nan | nan | -2.0% | -5.2% | 0.016 | 4.8% | 5.5% | -0.005 | 9.7% | 2.5% | 4.8% | 10.2% | 13.9% | 8.8% | -10.2% | -0.2% | 4.8% |
4 | Opponent | 82 | 19880 | 3439 | 7313 | 0.47 | 74 | 259 | 0.286 | 3365 | 7054 | 0.477 | 1712 | 2222 | 0.77 | 1168 | 2294 | 3462 | 1867 | 686 | 419 | 1635 | 2059 | 8664 |
5 | Opponent/G | nan | 242.4 | 41.9 | 89.2 | 0.47 | 0.9 | 3.2 | 0.286 | 41.0 | 86.0 | 0.477 | 20.9 | 27.1 | 0.77 | 14.2 | 28.0 | 42.2 | 22.8 | 8.4 | 5.1 | 19.9 | 25.1 | 105.7 |
6 | Lg Rank | nan | 4 | 6 | 7 | 8 | 17 | 17 | 15 | 5 | 7 | 8 | 11 | 10 | 17 | 6 | 4 | 2 | 3 | 2 | 11 | 9 | 6 | 6 |
7 | Year/Year | nan | 1.0% | -10.8% | -3.7% | -0.037 | nan | nan | nan | -12.7% | -7.1% | -0.031 | 8.5% | 6.9% | 0.011 | 4.1% | -6.5% | -3.2% | -14.0% | -4.3% | -4.3% | 2.0% | 1.7% | -6.7% |
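If you want to check for yourself why the counts differ, a rough diagnostic is to count the table tags in the live markup separately from the ones hidden in comments. The sketch below uses only the standard library; the TableCounter class and the sample markup are made up for illustration and are not taken from the real page.

```python
from html.parser import HTMLParser


class TableCounter(HTMLParser):
    """Count <table> tags in the live markup and inside HTML comments."""

    def __init__(self):
        super().__init__()
        self.live_tables = 0
        self.commented_tables = 0

    def handle_starttag(self, tag, attrs):
        # Called for every opening tag in the live markup
        if tag == "table":
            self.live_tables += 1

    def handle_comment(self, data):
        # A comment body is plain text to the parser, so parse it separately
        inner = TableCounter()
        inner.feed(data)
        inner.close()
        self.commented_tables += inner.live_tables


# Hypothetical sample markup standing in for the real page
sample = """
<table id="roster"><tr><td>1</td></tr></table>
<div id="all_team_and_opponent">
<!--
<div id="div_team_and_opponent">
<table id="team_and_opponent"><tr><td>2</td></tr></table>
</div>
-->
</div>
"""

counter = TableCounter()
counter.feed(sample)
counter.close()
print(counter.live_tables, counter.commented_tables)  # prints "1 1"
```

To try it on the real page, feed it requests.get(url).text instead of the sample. A nonzero commented count would account for the tables that pd.read_html(url) does not return.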
Upvotes: 1