Shivam

Reputation: 21

Getting an AttributeError when adding the .text attribute

I have tried the script below and it works just fine:

from bs4 import BeautifulSoup
import requests 

pr= input("search: ")

source= requests.get('https://www.flipkart.com/search?q={}&otracker=search&otracker1=search&marketplace=FLIPKART&as-show=on&as=off'.format(pr)).content
soup = BeautifulSoup(source, 'html.parser')

url= soup.find_all('div', class_=('_3O0U0u'))
whole_product_list= []
whole_url_list= []
main_product_list= []
main_url_list= []
        

for i in url:
    tag_a_data= i.find_all('a')
    for l in tag_a_data:
        product_list= l.find('div', class_= '_3wU53n')

        if product_list:
            main_product_list.append(product_list.text)

        else:
            product_ok= l.get('title')
            main_product_list.append(product_ok)

print(main_product_list) 

For example, if I pass "samsung" as input, it returns a list built from the text of each "div" with the given class. If I pass something else, like "shoes", where the links carry a "title" attribute instead, it returns a list of all the titles available in the HTML.

But if I reverse the order, like below:

from bs4 import BeautifulSoup
import requests 

pr= input("search: ")

source= requests.get('https://www.flipkart.com/search?q={}&otracker=search&otracker1=search&marketplace=FLIPKART&as-show=on&as=off'.format(pr)).content
soup = BeautifulSoup(source, 'html.parser')

url= soup.find_all('div', class_=('_3O0U0u'))
whole_product_list= []
whole_url_list= []
main_product_list= []
main_url_list= []
        


for i in url:
    tag_a_data= i.find_all('a')
    for l in tag_a_data:
        product_list = l.get('title')

        

        if product_list:
            main_product_list.append(product_list)

        else:
            product_ok= l.find('div', class_= '_3wU53n').text
            main_product_list.append(product_ok)    


print(main_product_list)

it raises an AttributeError:

Traceback (most recent call last):
  File "tess.py", line 28, in <module>
    product_ok= l.find('div', class_= '_3wU53n').text
AttributeError: 'NoneType' object has no attribute 'text'

I don't understand why the first script works with its if-else logic but the second one does not.

Upvotes: 0

Views: 92

Answers (2)

Arpit Omprakash

Reputation: 118

Suppose the loop produces the following values for "l":

  • item1 <a title="title1"><div class="_3wU53n">xyz</div></a>
  • item2 <a><div>xyz</div></a>
  • item3 <a title="title1"><div class="_3wU53n">xyz</div></a>

In the first script, l.find('div', class_='_3wU53n') returns a tag for item1 and item3, and .text is read only after the if check confirms that the tag exists. For item2 the else branch calls l.get('title'), which returns None for a missing attribute instead of raising. So the script runs without error, although None can end up in the list.

In the second script, l.get('title') succeeds for item1 and item3. For item2 the else branch calls l.find('div', class_='_3wU53n').text unconditionally. The required div doesn't exist for item2, so find() returns None, and reading .text on None causes the AttributeError.

In short, a link without a title is not guaranteed to contain the required div, so check for the div before reading .text.
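The difference between the two orderings can be reproduced offline. This is a minimal sketch, not the asker's exact page: the HTML string and the helper names div_first and title_first are made up for illustration, and the third link is assumed to have neither a title nor the product div.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real search results page.
HTML = """
<div class="_3O0U0u">
  <a title="Shoe A"></a>
  <a><div class="_3wU53n">Phone B</div></a>
  <a><img src="banner.png"></a>
</div>
"""

links = BeautifulSoup(HTML, "html.parser").find_all("a")


def div_first(link):
    """Order used in the first script: try the div, fall back to title."""
    div = link.find("div", class_="_3wU53n")
    if div:
        return div.text
    return link.get("title")  # returns None for a missing attribute, no error


def title_first(link):
    """Order used in the second script: try title, fall back to the div."""
    title = link.get("title")
    if title:
        return title
    # find() returns None when the div is absent, so .text fails here
    return link.find("div", class_="_3wU53n").text


print([div_first(a) for a in links])

try:
    print([title_first(a) for a in links])
except AttributeError as exc:
    print("AttributeError:", exc)
```

With this sample markup the first ordering finishes but puts None in the list for the third link, while the second ordering raises on that same link.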

The following change should get it working:

from bs4 import BeautifulSoup
import requests 

pr= input("search: ")

source= requests.get('https://www.flipkart.com/search?q={}&otracker=search&otracker1=search&marketplace=FLIPKART&as-show=on&as=off'.format(pr)).content
soup = BeautifulSoup(source, 'html.parser')

url= soup.find_all('div', class_=('_3O0U0u'))
whole_product_list= []
whole_url_list= []
main_product_list= []
main_url_list= []
        


for i in url:
    tag_a_data= i.find_all('a')
    for l in tag_a_data:
        product_list = l.get('title')

        

        if product_list:
            main_product_list.append(product_list)

        else:
            product_div = l.find('div', class_='_3wU53n')
            if product_div:
                # Only read .text once the div is known to exist
                main_product_list.append(product_div.text)


print(main_product_list)

Upvotes: 1

Martín Nieva

Reputation: 504

In this line:

product_ok= l.find('div', class_= '_3wU53n').text

l.find('div', class_= '_3wU53n') returns None, meaning it didn't find the div. None has no text attribute, so accessing .text raises an AttributeError.

One fix is to use the walrus operator (available from Python 3.8):

if product_ok := l.find('div', class_= '_3wU53n'):
    product_ok = product_ok.text
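As a fuller sketch of the same idea, the lookup can be wrapped in a small helper that tries the title first, then the div, and skips links that have neither. The sample HTML and the name product_name are hypothetical, chosen only to make the example runnable without a network request; it needs Python 3.8 or later.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the last link has neither a title nor the product div.
HTML = """
<div class="_3O0U0u">
  <a title="Shoe A"></a>
  <a><div class="_3wU53n">Phone B</div></a>
  <a><img src="banner.png"></a>
</div>
"""


def product_name(link):
    """Return the title, else the product div's text, else None."""
    if title := link.get("title"):
        return title
    if div := link.find("div", class_="_3wU53n"):
        return div.text
    return None


soup = BeautifulSoup(HTML, "html.parser")

# Keep only the links for which a name was found.
names = [name for a in soup.find_all("a") if (name := product_name(a))]
print(names)
```

Each lookup result is bound and tested in one step, so .text is only ever read on a real tag.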

Upvotes: 2
