najoza

Reputation: 3

"Extract text within div tag" but gives error

I would like to extract the text "Available at Amazon starting at $200" from each div. Here's what I've written so far:

import re
Cost_List = []
for result in stroller_result:
    if result.find('div', class_ = 'product__retailer__text') is not None:
        Cost = result.find_all('div', {'class': ["product__retailer__text", re.compile(r'\$\d+(?:\.\d+)?')]})
    Cost_List.append(Cost)

This code returns this:

[[<div class="product__retailer__text">Available at Amazon starting at $200</div>],
 [<div class="product__retailer__text">Available at Amazon starting at $200</div>],
 [<div class="product__retailer__text">Available at Amazon starting at $500</div>],
 [<div class="product__retailer__text">Available at Amazon starting at $100</div>],
 [<div class="product__retailer__text">Available at Amazon starting at $100</div>],

If I try to extract the text 'Available at Amazon starting at $200' using .text, I get the following error: AttributeError: ResultSet object has no attribute 'text'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()? Here is the code that raises it:

import re
Cost_List = []
for result in stroller_result:
    if result.find('div', class_ = 'product__retailer__text') is not None:
        Cost = result.find_all('div', {'class': ["product__retailer__text", re.compile(r'\$\d+(?:\.\d+)?')]}).text
    Cost_List.append(Cost)

Upvotes: 0

Views: 38

Answers (2)

najoza

Reputation: 3

This code worked:

import re
Cost_List = []
for result in stroller_result:
    if result.find('div', class_ = 'product__retailer__text') is not None:
        Cost = [x.text for x in result.find_all('div', {'class': ["product__retailer__text", re.compile(r'\$\d+(?:\.\d+)?')]})]
        Cost_List.append(Cost)  # append only when the div was found, so a stale Cost is never reused

Upvotes: 0

Mitchell Olislagers

Reputation: 1817

To get rid of your error, you can call .text on each element, rather than on the list of elements.

import re
Cost_List = []
for result in stroller_result:
    if result.find('div', class_ = 'product__retailer__text') is not None:
        Cost = result.find_all('div', {'class': ["product__retailer__text", re.compile(r'\$\d+(?:\.\d+)?')]})
        Cost = [el.text for el in Cost]
        Cost_List.append(Cost)
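
A side note on the regular expression: as far as I can tell, a pattern passed in the class filter is compared against the div's class names, not its text, so it does not select anything by price here. If the dollar amount itself is what you want, one option is to apply the same pattern to the extracted strings. A minimal sketch, assuming the text looks like the output in the question; PRICE_RE and extract_price are hypothetical names, not part of BeautifulSoup:

```python
import re

# Same pattern as in the question, with a capture group around the number.
PRICE_RE = re.compile(r'\$(\d+(?:\.\d+)?)')


def extract_price(text):
    """Return the first dollar amount in text as a float, or None if absent."""
    match = PRICE_RE.search(text)
    return float(match.group(1)) if match else None


# Strings like those produced by calling .text on each div (values from the question).
texts = [
    "Available at Amazon starting at $200",
    "Available at Amazon starting at $500",
    "Available at Amazon starting at $100",
]
prices = [extract_price(t) for t in texts]
print(prices)  # [200.0, 500.0, 100.0]
```

Returning None for strings without a price keeps the result aligned with the input list, so missing entries can be filtered out afterwards.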

Upvotes: 0
