Reputation: 471
I am trying to build a web scraper (code below) but I am always getting this error:
Traceback (most recent call last):
  File "wikipedia.py", line 11, in <module>
    for table in match.find_all('table'):
  File "/Users/claycrosby/opt/anaconda3/lib/python3.8/site-packages/bs4/element.py", line 2160, in __getattr__
    raise AttributeError(
AttributeError: ResultSet object has no attribute 'find_all'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?
I've tried swapping find for find_all as per the message, but it doesn't change the error that is returned. Further, there are multiple tables nested within the vevent summary which I initially find. I've tried multiple iterations of going directly from vevent summary to tr and I get the same error.
import requests
from bs4 import BeautifulSoup
url = 'https://en.wikipedia.org/wiki/2020%E2%80%9321_Top_14_season'
response = requests.get(url)
html = response.content
soup = BeautifulSoup(html, 'html.parser')
match = soup.find_all('.vevent summary')
#for table in match.find_all('table'):
for data in match.find_all('tbody'):
    for row in data.find('tr'):
        for cell in row.find('td'):
            print (cell.text.replace(' ', ''))
Upvotes: 1
Views: 48
Reputation: 442
You can try this:
import urllib.request
from bs4 import BeautifulSoup
url_1 = 'https://en.wikipedia.org/wiki/2020%E2%80%9321_Top_14_season'
# Download the raw HTML of the page
sauce_1 = urllib.request.urlopen(url_1).read()
# The 'lxml' parser requires the lxml package to be installed
soup_1 = BeautifulSoup(sauce_1, 'lxml')
# Print the text content of every table on the page
for table in soup_1.find_all('table'):
    print(table.text)
Upvotes: 0
Reputation: 11342
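In Beautiful Soup, use find_all("div", {"class": "vevent"}) to search by class name. Passing '.vevent summary' to find_all does not work, because find_all treats that string as a tag name rather than a CSS selector, so it matches nothing. The error refers to match: find_all returns a ResultSet (a list of elements), which has no find_all method of its own, so you need to loop over it and call find_all on each element.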
Try this code:
import requests
from bs4 import BeautifulSoup
url = 'https://en.wikipedia.org/wiki/2020%E2%80%9321_Top_14_season'
response = requests.get(url)
html = response.content
soup = BeautifulSoup(html, 'html.parser')
match = soup.find_all("div", {"class": "vevent"})  # returns list
print('matches', len(match))
for m in match:
    for table in m.find_all('table'):
        for data in table.find_all('tbody'):
            for row in data.find_all('tr'):
                for cell in row.find_all('td'):
                    print(cell.text.replace(' ', ''))
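If you prefer the CSS-selector style you started with, select() is the method that accepts selectors. Here is a minimal offline sketch of that approach. The SAMPLE_HTML markup and the extract_rows helper are made up for illustration, and it assumes the match boxes carry both the vevent and summary classes; the real Wikipedia markup may differ, so check it in your browser first.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page markup (an assumption, not the real
# Wikipedia HTML); it lets the example run without a network request.
SAMPLE_HTML = """
<div class="vevent summary">
  <table><tbody>
    <tr><td>Team A</td><td>20 - 17</td><td>Team B</td></tr>
  </tbody></table>
</div>
<div class="vevent summary">
  <table><tbody>
    <tr><td>Team C</td><td>9 - 12</td><td>Team D</td></tr>
  </tbody></table>
</div>
"""


def extract_rows(html):
    """Return one list of cell strings per table row inside each matched block."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    # select() takes a CSS selector; ".vevent.summary" means an element
    # that has both classes. It returns a list, so loop over it.
    for block in soup.select(".vevent.summary"):
        # Each item in the list is a single Tag, so find_all() works here.
        for row in block.find_all("tr"):
            cells = [cell.get_text(strip=True) for cell in row.find_all("td")]
            if cells:
                rows.append(cells)
    return rows


rows = extract_rows(SAMPLE_HTML)
for cells in rows:
    print(cells)
```

To run it against the live page, pass response.content from requests.get(url) to extract_rows instead of SAMPLE_HTML. Collecting each row into a list, rather than printing cell by cell, keeps the cells of one match together.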
Upvotes: 1