Reputation: 31

parse html beautiful soup

I have an HTML page:

    <a email="[email protected]" href="http://www.max.ru/agent?message&[email protected]" title="Click herе" class="mf_spIco spr-mrim-9"></a><a class="mf_t11" type="booster" href="http://max.ru/mail/corporate/">

I need to parse out the email string:

    soup = BeautifulSoup(data)
    string = soup.find("a",{"email": ""})
    print string

But it is not working. Where is the mistake?

Upvotes: 0

Answers (1)

Day

Reputation: 9703

Your mistake was in using the attrs dict to look for elements with an email attribute that is empty. Try this instead.

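    #!/usr/bin/env python

    from BeautifulSoup import BeautifulSoup
    import urllib2

    # Fetch the page; urlopen returns a file-like object that BeautifulSoup can read.
    req = urllib2.urlopen('http://worldnuclearwar.ru')

    soup = BeautifulSoup(req)
    # email=True matches any <a> tag that has an email attribute, whatever its value.
    print soup.find("a", email=True)["email"]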

This prints the email attribute of the first a element that has an email attribute. If you want all emails, try:

    for link in soup.findAll("a", email=True):
        print link["email"]
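
If you are on Python 3, the same `email=True` filter should carry over to Beautiful Soup 4. The following is only a rough sketch, assuming the third-party `bs4` package is installed. The inline markup, the example.com addresses, and the helper names `first_email` and `extract_emails` are made up for illustration and are not from your page.

```python
from bs4 import BeautifulSoup  # assumption: the third-party bs4 package is installed

# Made-up markup for illustration; the addresses are placeholders.
HTML = """
<a email="first@example.com" href="http://example.com/a" class="mf_spIco"></a>
<a class="mf_t11" href="http://example.com/b">no email here</a>
<a email="second@example.com" href="http://example.com/c"></a>
"""


def extract_emails(html):
    """Return the email attribute of every <a> tag that has one."""
    soup = BeautifulSoup(html, "html.parser")
    # email=True matches any tag carrying an email attribute, whatever its value.
    return [link["email"] for link in soup.find_all("a", email=True)]


def first_email(html):
    """Return the first email attribute, or None if no <a> tag has one."""
    soup = BeautifulSoup(html, "html.parser")
    link = soup.find("a", email=True)
    # find() returns None when nothing matches, so guard before indexing.
    return link["email"] if link is not None else None


if __name__ == "__main__":
    print(first_email(HTML))
    print(extract_emails(HTML))
```

The `None` check matters because indexing the result of `find()` directly raises a `TypeError` when no a element has an email attribute.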

Upvotes: 4
