kevlar1818

Reputation: 3125

Basic Python/Beautiful Soup Parsing

Say I use

date = r.find('abbr')

to get

<abbr class="dtstart" title="2012-11-16T00:00:00-05:00">November 16, 2012</abbr>

I just want to print November 16, 2012, but if I try

print date.string

I get

AttributeError: 'NoneType' object has no attribute 'string'

What am I doing wrong?

UPDATE: Here's my code. Neither of the commented-out print pairs prints the raw string, but the uncommented ones get the correct tags.

import urllib2
from BeautifulSoup import BeautifulSoup
page = urllib2.urlopen("some-url-path")
soup = BeautifulSoup(page)
calendar = soup.find('table',{"class" : "vcalendar ical"})
for r in calendar.findAll('tr'):
#   print ''.join(r.findAll('abbr',text=True))
#   print ''.join(r.findAll('strong',text=True))
    print r.find('abbr')
    print r.find('strong')

Upvotes: 3

Views: 2155

Answers (2)

Acorn

Reputation: 50497

soup.find('abbr').string should work fine. There must be something wrong with date.

from BeautifulSoup import BeautifulSoup

doc = '<abbr class="dtstart" title="2012-11-16T00:00:00-05:00">November 16, 2012</abbr>'

soup = BeautifulSoup(doc)

for abbr in soup.findAll('abbr'):
    print abbr.string

Result:

November 16, 2012

Update based on code added to question:

You can't use the text parameter like that.

http://www.crummy.com/software/BeautifulSoup/documentation.html#arg-text

text is an argument that lets you search for NavigableString objects instead of Tags

Either you're looking for text nodes, or you're looking for tags. A text node can't have a tag name.

Maybe you want ''.join([el.string for el in r.findAll('strong')])?
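A minimal sketch of the difference between searching for tags and searching for text nodes. It assumes the newer bs4 package (Beautiful Soup 4) rather than the BeautifulSoup 3 import used in the question; there the method is spelled find_all and the text argument is named string. The row markup is made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical row, made up for illustration; not the asker's real page.
row_html = (
    '<tr><td><abbr class="dtstart" title="2012-11-16T00:00:00-05:00">'
    'November 16, 2012</abbr></td>'
    '<td><strong>Sample </strong><strong>event</strong></td></tr>'
)
r = BeautifulSoup(row_html, "html.parser")

# Searching by tag name returns Tag objects; read their text afterwards.
strong_text = ''.join(el.get_text() for el in r.find_all('strong'))
print(strong_text)

# Searching with string=True (and no tag name) returns the text nodes
# themselves, so there is no tag left to ask for .string on.
all_text = [str(s) for s in r.find_all(string=True)]
print(all_text)
```

The first search is the one that fits the question: pick the tags by name, then extract their text.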

Upvotes: 3

unutbu

Reputation: 879083

The error message is saying that date is None, which means r.find('abbr') found no matching tag. You haven't shown enough code to say why that is so. Indeed, using the code you posted in the most straightforward way should work:

import BeautifulSoup

content='<abbr class="dtstart" title="2012-11-16T00:00:00-05:00">November 16, 2012</abbr>'
r=BeautifulSoup.BeautifulSoup(content)
date=r.find('abbr')
print(date.string)
# November 16, 2012
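In a loop over table rows like the one in the question, one possible cause (an assumption, since the real page isn't shown) is that some rows, such as a header row, contain no abbr tag, so find returns None for them. A sketch that skips such rows, assuming the newer bs4 package and made-up sample markup:

```python
from bs4 import BeautifulSoup

# Hypothetical table, made up for illustration; the header row has no <abbr>.
html = """
<table class="vcalendar ical">
  <tr><th>Date</th><th>Event</th></tr>
  <tr>
    <td><abbr class="dtstart" title="2012-11-16T00:00:00-05:00">November 16, 2012</abbr></td>
    <td><strong>Sample event</strong></td>
  </tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

rows = []
for r in soup.find_all("tr"):
    abbr = r.find("abbr")
    strong = r.find("strong")
    # find() returns None when the row has no matching tag, so check first.
    if abbr is None or strong is None:
        continue
    rows.append((abbr.get_text(), strong.get_text()))

for date_text, title_text in rows:
    print(date_text, title_text)
```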

Upvotes: 0
