extract email address using beautifulsoup (TypeError: 'int' object is not subscriptable)

Question

I have a quick issue with this part of my code. I'm using BeautifulSoup to scrape a website, and I need to extract only the email address from the href attribute of an a tag that sits inside a div with a class (see below):

And my code gives me this error: TypeError: 'int' object is not subscriptable

import requests
from bs4 import BeautifulSoup
import re

source_code = requests.get(item_url)
plain_text = source_code.text
soup = BeautifulSoup(plain_text, "html.parser")

for link in soup.find('div', {'class': 'startup-email-link'}):
    href = link.find('a')['href']
    print(href)


    #href_final = re.compile('mailto')
    #print(href_final)

fodma1 · Accepted Answer

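soup.find already returns a single tag (or None), so there is no need to iterate over it. Looping over a tag walks its children, which include bare text nodes, and that is most likely where the error comes from. You can get the link directly with: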

soup.find('div', {'class': 'startup-email-link'}).find('a')['href']
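To see where the TypeError most likely comes from, here is a minimal sketch that needs no network access. It assumes the first child of the div is a whitespace text node. BeautifulSoup text nodes subclass str, so a plain string stands in for one here, and .find on it is the ordinary substring search, not a tag lookup.

```python
# A plain str stands in for the text node (assumption: the loop in the
# question hit a whitespace text node, which in bs4 is a str subclass).
text_node = "\n"

# On a string, .find is str.find: it returns an index, or -1 if not found.
position = text_node.find('a')

try:
    position['href']  # same shape as link.find('a')['href'] in the question
    message = None
except TypeError as exc:
    message = str(exc)

print(message)
```

The message printed is the same one reported in the question, which is why dropping the loop and calling .find('a') on the div itself resolves it.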

You may want to make it more robust in case the div with the class or the anchor tag is missing:

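def get_email_href(soup):
    # Illustrative helper name. Returns the href of the link inside the
    # email div, or None if either the div or the anchor is missing.
    div = soup.find('div', {'class': 'startup-email-link'})
    if div is None:
        return None
    anchor = div.find('a')
    if anchor is None:
        return None
    return anchor['href']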

Or you can use a CSS selector if you prefer something more concise:

def get_email_href_css(soup):
    # Illustrative helper name. Same lookup expressed as a CSS selector.
    selection = soup.select('div.startup-email-link > a')
    if not selection:
        return None
    return selection[0]['href']
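Since the question asks for only the email address, the href still needs a little cleanup. This is a rough sketch, assuming the link is a mailto: URL such as mailto:someone@example.com, possibly followed by a ?subject=... part. The helper name email_from_href and the sample addresses are made up for illustration.

```python
from urllib.parse import unquote


def email_from_href(href):
    """Return the address part of a mailto: href, or None if it is not one.

    Hypothetical helper: assumes a single recipient and ignores any
    ?subject=... or other parameters after the address.
    """
    prefix = "mailto:"
    if not href or not href.lower().startswith(prefix):
        return None
    # Drop the scheme, then anything after the first '?'.
    address = href[len(prefix):].split("?", 1)[0]
    # Decode percent-escapes such as %40 and trim stray whitespace.
    return unquote(address).strip() or None


print(email_from_href("mailto:hello@example.com?subject=Hi"))
```

Pass it whatever the lookup above returned; a None or non-mailto href comes back as None, so the missing-tag case still flows through cleanly.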
