Oluchi A

Reputation: 11

"NoneType" Error On Specific Table Using Beautiful Soup

I am working with this page: https://erowid.org/general/big_chart.shtml. I am trying to extract the name of each drug by using Beautiful Soup to read the tables on the page.

This code works perfectly:

chem_table = bs.find('table', id="section-CHEMICALS")
for row in chem_table.find_all('tr'):
    print(row.find('a').contents[0])

This code raises an error, even though both tables appear to use the same format:

plants_table = bs.find('table', id="section-PLANTS")
for r in plants_table.find_all('tr'):
    print(r.find('a').contents[0])

This is the error I get for the second block: AttributeError: 'NoneType' object has no attribute 'contents'

However, 'print(r.find('a'))' works perfectly.

I tried to see if 'a' existed by running

 r.find('a'),

which gave the correct results. Then I tried

r.find('a').text,

which again gave me a NoneType error.

Upvotes: 1

Views: 111

Answers (1)

Andrej Kesely

Reputation: 195553

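The first row in the PLANTS table doesn't contain an <a> tag, so r.find('a') returns None for that row, and accessing .contents or .text on None raises the AttributeError. Printing r.find('a') by itself doesn't fail because it just prints None for that row. You need to skip rows that have no <a> tag: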

import requests
from bs4 import BeautifulSoup

url = 'https://erowid.org/general/big_chart.shtml'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')

plants_table = soup.select_one('#section-PLANTS')
for r in plants_table.select('tr:has(a)'):
    print(r.find('a').text)

Prints:

...
tobacco
virola
voacanga_africana
wormwood
yerba_mate
yohimbe

plants_table.select('tr:has(a)') uses the CSS selector tr:has(a), which selects every <tr> tag that contains at least one <a> tag.
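
If you would rather keep the find_all loop from the question, an explicit None check should do the same job. The snippet below is only a sketch: the HTML string is a made-up, simplified stand-in for the real PLANTS table (a header row without a link, followed by rows with links), so it runs without a network request.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for the PLANTS table on the page:
# the first row is a header with no <a> tag, the others contain a link.
html = """
<table id="section-PLANTS">
  <tr><th>PLANTS</th></tr>
  <tr><td><a href="/plants/tobacco/">tobacco</a></td></tr>
  <tr><td><a href="/plants/yohimbe/">yohimbe</a></td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
plants_table = soup.find("table", id="section-PLANTS")

names = []
for row in plants_table.find_all("tr"):
    link = row.find("a")  # find() returns None when the row has no <a>
    if link is None:
        continue  # skip header rows instead of calling .text on None
    names.append(link.get_text(strip=True))

print(names)  # ['tobacco', 'yohimbe']
```

The same check (if link is None: continue) can be dropped into the original loop over the live page. It trades the compact selector for a loop that makes the skipped rows explicit.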

Upvotes: 2
