Keep getting 'TypeError: 'NoneType' object is not callable' with beautiful soup and python3

Question

I am a beginner struggling through a course, so this problem is probably really simple. I am running this (admittedly messy) code, saved as x.py, to extract a link and a name from a website with line formats like:


  Prabhjoit

So I set up this:

import urllib.request, urllib.parse, urllib.error
from bs4 import BeautifulSoup
import ssl

# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

url = input('Enter - ')
html = urllib.request.urlopen(url, context=ctx).read()
soup = BeautifulSoup(html, 'html.parser')
for line in soup:
    if not line.startswith('')
    count = count + 1
    if count == 18:
        break
print(name[1])
print(link)

And it keeps producing the error:

Traceback (most recent call last):
  File "x.py", line 15, in <module>
    if not line.startswith('
TypeError: 'NoneType' object is not callable



I have struggled with this for hours, and I would be grateful for any suggestions.

Martijn Pieters · Accepted Answer

line is not a string, and it has no startswith() method. It is a BeautifulSoup Tag object, because BeautifulSoup has parsed the HTML source text into a rich object model. Don't try to treat it as text!
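To see what the loop is actually iterating over, here is a minimal sketch. The two-paragraph markup is a made-up stand-in for the real page, not the asker's data. Iterating over a soup yields its top-level child nodes, not lines of text:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page.
soup = BeautifulSoup('<p>one</p>\n<p>two</p>', 'html.parser')

# Iterating over the soup walks its direct children: Tag objects for
# elements and NavigableString objects for the text between them.
kinds = [type(child).__name__ for child in soup]
print(kinds)
```

Only the NavigableString nodes behave like strings. The Tag nodes are the ones that trip up string methods such as startswith().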

The error is caused because if you access any attribute on a Tag object that it doesn't know about, it does a search for a child element with that name (so here it executes line.find('startswith')), and since there is no element with that name, None is returned. The call line.startswith(...) is therefore really None(...), which fails with the error you see.
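The attribute fallback described above can be reproduced in a few lines. This is a sketch on a made-up one-item list (the markup and the /x.html link are placeholders), showing both the failure and one way to test the element name instead:

```python
from bs4 import BeautifulSoup

# Hypothetical one-item document, standing in for the real page.
soup = BeautifulSoup('<ul><li><a href="/x.html">Example</a></li></ul>',
                     'html.parser')
tag = soup.find('li')

# Tag has no 'startswith' attribute, so bs4 searches for a child element
# named <startswith>, finds none, and hands back None.
attr = tag.startswith
print(attr)

try:
    tag.startswith('<li')  # effectively None('<li')
except TypeError as exc:
    message = str(exc)
    print(message)

# Checking the element name avoids treating the Tag as text.
is_list_item = tag.name == 'li'
print(is_list_item)
```

Comparing tag.name is only one option. Converting with str(tag) would also give a real string, at the cost of re-serialising the element.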

If you wanted to find the 18th <li> element, just ask BeautifulSoup for that specific element:

soup = BeautifulSoup(html, 'html.parser')
li_link_elements = soup.select('li a[href]', limit=18)
if len(li_link_elements) == 18:
    last = li_link_elements[-1]
    print(last.get_text())
    print(last['href'])

This uses a CSS selector to find only the link elements that sit inside an <li> element and that have an href attribute. The search is limited to just 18 such tags, and the last one is printed, but only if we actually found 18 in the page.

The element text is retrieved with the Tag.get_text() method, which includes text from any nested elements (any extra markup inside the link), and the href attribute is accessed using standard indexing notation.
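For trying the selector approach without hitting a live site, here is a small self-contained sketch. The sample markup, the example.com URLs, and the nth_list_link helper are all hypothetical, introduced only for illustration:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the names and URLs are placeholders.
SAMPLE_HTML = """
<ul>
  <li><a href="http://example.com/a.html">Alice</a></li>
  <li><a href="http://example.com/b.html"><b>Bob</b></a></li>
  <li><a>No href here</a></li>
  <li><a href="http://example.com/c.html">Carol</a></li>
</ul>
"""


def nth_list_link(html, n):
    """Return (text, href) for the n-th <a href> inside an <li>, or None."""
    soup = BeautifulSoup(html, 'html.parser')
    # limit=n stops the search once n matching links have been found.
    links = soup.select('li a[href]', limit=n)
    if len(links) < n:
        return None  # the page has fewer than n matching links
    last = links[-1]
    return last.get_text(), last['href']


print(nth_list_link(SAMPLE_HTML, 2))   # text comes from the nested <b>
print(nth_list_link(SAMPLE_HTML, 3))   # the link without href is skipped
print(nth_list_link(SAMPLE_HTML, 18))  # not enough links in the sample
```

Returning None when there are too few links mirrors the len() check in the answer's code. Raising an exception instead would be an equally reasonable choice.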
