Pulling Text from Type 'Navigable String' and 'Tag' on Beautiful Soup

Question

I'm stuck parsing the part of the Rotten Tomatoes website that has the critics' score inside a tag and the "%" as separate text. I followed some SO suggestions, such as using find_all('span',text="true"), but the Python 3.5.1 shell returned this error:

AttributeError: 'NavigableString' object has no attribute 'find_all'

I also tried finding the direct child of the Beautiful Soup object criticscore, but received the same error. Please tell me where I went wrong. Here's my Python code:

import urllib.request
from bs4 import BeautifulSoup

def get_rating(address):
    """pull ratings numbers from rotten tomatoes"""
    RTaddress = urllib.request.urlopen(address)
    tomatoe = BeautifulSoup(RTaddress, "lxml")
    for criticscore in tomatoe.find('span', class_=['meter-value superPageFontColor']):
        print(''.join(criticscore.find_all('span', recursive=False))) #print the Tomatometer

Also, here's the code on Rotten Tomatoes I'm interested in scraping:

96%

alecxe · Accepted Answer

The problem line is this one:

for criticscore in tomatoe.find('span', class_=['meter-value superPageFontColor']):

Here, you are locating a single element via find() and then iterating over its children, which can be text nodes as well as other elements (this is what happens in BeautifulSoup when you iterate over an element). The "%" child is a NavigableString, which has no find_all() method, hence the error.
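To make this concrete, here is a minimal sketch of what iterating over a single element yields. The markup is an assumption reconstructed from the question's description (a score in an inner span followed by a bare "%"); the real page may differ, and html.parser is used only to avoid the lxml dependency.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Assumed markup: inner <span> holds the number, "%" is a bare text node.
html = '<span class="meter-value superPageFontColor"><span>96</span>%</span>'
soup = BeautifulSoup(html, "html.parser")

# find() returns one Tag; iterating over it walks its direct children.
outer = soup.find("span", class_="meter-value")
child_types = [type(child).__name__ for child in outer]
print(child_types)

# Keeping only Tag children avoids calling find_all() on a text node.
tags_only = [child for child in outer if isinstance(child, Tag)]
print([tag.get_text() for tag in tags_only])
```

With this markup the children are one Tag and one NavigableString, and only the Tag supports find_all().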

You probably meant to use find_all() instead of find():

for criticscore in tomatoe.find_all('span', class_=['meter-value superPageFontColor']):
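As a rough sketch of the full loop with that change, again using assumed markup rather than the live page: note that ''.join() needs strings, so the inner tags are converted with get_text() before joining. Matching on the single class "meter-value" is a simplification chosen here, not something the original code does.

```python
from bs4 import BeautifulSoup

# Assumed markup, reconstructed from the question's description.
html = '<span class="meter-value superPageFontColor"><span>96</span>%</span>'
soup = BeautifulSoup(html, "html.parser")

scores = []
# find_all() returns a list of matching Tags, so each loop item is a Tag.
for criticscore in soup.find_all("span", class_="meter-value"):
    # recursive=False restricts the search to direct children.
    inner = criticscore.find_all("span", recursive=False)
    # join() requires strings, so extract the text from each Tag first.
    scores.append("".join(tag.get_text() for tag in inner))

print(scores)
```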

Or, you can use a single CSS selector instead:

for criticscore in tomatoe.select('span.meter-value > span'):
    print(criticscore.get_text())

where > means a direct parent-child relationship (this is your recursive=False replacement).
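A small self-contained sketch of the selector approach follows, under the same assumed markup as the question describes. It also shows one possible way to get the number together with the "%" by reading the text of the outer span; that second step is an illustration, not part of the original answer.

```python
from bs4 import BeautifulSoup

# Assumed markup, reconstructed from the question's description.
html = '<span class="meter-value superPageFontColor"><span>96</span>%</span>'
soup = BeautifulSoup(html, "html.parser")

# "span.meter-value > span" selects spans that are direct children
# of a span carrying the meter-value class.
numbers = [tag.get_text() for tag in soup.select("span.meter-value > span")]
print(numbers)

# The outer span's text includes both the number and the "%" text node.
outer = soup.select_one("span.meter-value")
full_text = outer.get_text(strip=True)
print(full_text)
```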
