Reputation: 75
I'm scraping from the following page: https://www.pro-football-reference.com/boxscores/201809060phi.htm
I have this code:
import requests
from bs4 import BeautifulSoup
url = 'https://www.pro-football-reference.com/boxscores/201809060phi.htm'
r = requests.get(url)
soup = BeautifulSoup(r.text, 'lxml')
tables = soup.findAll("div",{"class":"table_outer_container"})
print (len(tables))
Each table on the page is wrapped in a div with the class "table_outer_container", but my print statement outputs only 1. Am I wrong to expect findAll to collect all of those elements into the variable "tables"?
Upvotes: 0
Views: 28
Reputation: 22440
It's because most of the tables are inside HTML comments, and your script won't find them unless you strip the comment markers <!-- and --> from the response. Try the following. It should give you 20 tables from that page.
import requests
from bs4 import BeautifulSoup

url = 'https://www.pro-football-reference.com/boxscores/201809060phi.htm'
html = requests.get(url).text
# Strip the comment markers so the parser treats the hidden tables as regular markup
res = html.replace("<!--", "").replace("-->", "")
soup = BeautifulSoup(res, 'lxml')
tables = soup.find_all("div", {"class": "table_outer_container"})
print(len(tables))
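If you would rather not edit the raw response text, another option is to let BeautifulSoup hand you the comment nodes and parse each one separately. This is only a sketch: the helper name find_tables_including_comments and the SAMPLE_HTML string are made up for illustration, and it uses the built-in "html.parser" on a small inline snippet so it runs without a network request. I have not run it against the real page.

```python
from bs4 import BeautifulSoup, Comment


def find_tables_including_comments(html, parser="html.parser"):
    """Collect table containers from the visible markup and from HTML comments.

    Hypothetical helper for illustration; not part of BeautifulSoup.
    """
    soup = BeautifulSoup(html, parser)
    attrs = {"class": "table_outer_container"}

    # Containers that are part of the regular document tree
    tables = soup.find_all("div", attrs)

    # Comment nodes are strings, so each one can be parsed as its own document
    for comment in soup.find_all(string=lambda s: isinstance(s, Comment)):
        inner = BeautifulSoup(comment, parser)
        tables.extend(inner.find_all("div", attrs))

    return tables


# Assumed stand-in for the page: one visible table, one wrapped in a comment
SAMPLE_HTML = """
<div class="table_outer_container"><table id="scoring"></table></div>
<div id="all_team_stats">
<!--
<div class="table_outer_container"><table id="team_stats"></table></div>
-->
</div>
"""

tables = find_tables_including_comments(SAMPLE_HTML)
print(len(tables))
```

For the real page you would pass requests.get(url).text instead of SAMPLE_HTML. Unlike the string replace, this leaves any comments that are not markup untouched, at the cost of parsing each comment separately.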
Upvotes: 1