greenmeansgo

Reputation: 471

BeautifulSoup - AttributeError when using find_all on multiple `table`s

I am trying to build a web scraper (code below) but I am always getting this error:

Traceback (most recent call last):
  File "wikipedia.py", line 11, in <module>
    for table in match.find_all('table'):
  File "/Users/claycrosby/opt/anaconda3/lib/python3.8/site-packages/bs4/element.py", line 2160, in __getattr__
    raise AttributeError(
AttributeError: ResultSet object has no attribute 'find_all'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?

I've tried swapping `find` for `find_all` as the message suggests, but the same error is returned. Also, there are multiple tables nested within the `vevent summary` element that I initially find.

I've tried multiple iterations of going directly from `vevent summary` to `tr`, and I get the same error.

Code:

import requests
from bs4 import BeautifulSoup

url = 'https://en.wikipedia.org/wiki/2020%E2%80%9321_Top_14_season'
response = requests.get(url)
html = response.content

soup = BeautifulSoup(html, 'html.parser')
match = soup.find_all('.vevent summary')

#for table in match.find_all('table'):
for data in match.find_all('tbody'):
    for row in data.find('tr'):
        for cell in row.find('td'):
            print (cell.text.replace('&nbsp;', ''))

Upvotes: 1

Views: 48

Answers (2)

Matteo Bianchi

Reputation: 442

You can try this:

from bs4 import BeautifulSoup
import urllib.request

url_1 = 'https://en.wikipedia.org/wiki/2020%E2%80%9321_Top_14_season'
sauce_1 = urllib.request.urlopen(url_1).read()
soup_1 = BeautifulSoup(sauce_1, 'lxml')

# find_all is called on the soup itself, so iterating over the result works
for table in soup_1.find_all('table'):
    print(table.text)

Upvotes: 0

Mike67

Reputation: 11342

In Beautiful Soup, use `find_all("div", {"class": "vevent"})` to search by class name. The error occurs because `match` is a `ResultSet` (a list of elements), not a single element, so it has no `find_all` method; you have to iterate over it and call `find_all` on each element.

Try this code:

import requests
from bs4 import BeautifulSoup

url = 'https://en.wikipedia.org/wiki/2020%E2%80%9321_Top_14_season'
response = requests.get(url)
html = response.content

soup = BeautifulSoup(html, 'html.parser')
match = soup.find_all("div", {"class": "vevent"})  # returns a ResultSet (a list)

print('matches', len(match))

# iterate over the list, then search within each single element
for m in match:
    for table in m.find_all('table'):
        for data in table.find_all('tbody'):
            for row in data.find_all('tr'):
                for cell in row.find_all('td'):
                    # .text already decodes &nbsp; to '\xa0', so strip that instead
                    print(cell.text.replace('\xa0', ''))

Upvotes: 1
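
As an aside, the `.vevent summary` string passed to `find_all` in the question looks like a CSS selector, but `find_all` treats its first argument as a tag name, which is why it matched nothing. If you prefer CSS selectors, BeautifulSoup's `select` method accepts them directly. A minimal, self-contained sketch (using an inline HTML stand-in rather than the live Wikipedia page, so the structure shown here is illustrative, not the actual page markup):

```python
from bs4 import BeautifulSoup

# a tiny stand-in for the Wikipedia match markup
html = """
<div class="vevent">
  <table><tbody>
    <tr><td>Toulouse</td><td>20</td></tr>
    <tr><td>La Rochelle</td><td>18</td></tr>
  </tbody></table>
</div>
"""

soup = BeautifulSoup(html, 'html.parser')

# select() takes a CSS selector and returns a list of matching Tags
cells = [td.text for td in soup.select('div.vevent td')]
print(cells)  # ['Toulouse', '20', 'La Rochelle', '18']
```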
