Reputation: 21
I have tried the script below and it works just fine:
from bs4 import BeautifulSoup
import requests
pr= input("search: ")
source= requests.get('https://www.flipkart.com/search?q={}&otracker=search&otracker1=search&marketplace=FLIPKART&as-show=on&as=off'.format(pr)).content
soup = BeautifulSoup(source, 'html.parser')
url= soup.find_all('div', class_=('_3O0U0u'))
whole_product_list= []
whole_url_list= []
main_product_list= []
main_url_list= []
for i in url:
    tag_a_data= i.find_all('a')
    for l in tag_a_data:
        product_list= l.find('div', class_= '_3wU53n')
        if product_list:
            main_product_list.append(product_list.text)
        else:
            product_ok= l.get('title')
            main_product_list.append(product_ok)
print(main_product_list)
For example, if I pass "samsung" as input, it returns a list built from the "div" elements with the given class. If I pass something else such as "shoes", whose links carry a "title" attribute instead, it returns a list of all the titles available in the HTML.
But if I reverse the order, like below:
from bs4 import BeautifulSoup
import requests
pr= input("search: ")
source= requests.get('https://www.flipkart.com/search?q={}&otracker=search&otracker1=search&marketplace=FLIPKART&as-show=on&as=off'.format(pr)).content
soup = BeautifulSoup(source, 'html.parser')
url= soup.find_all('div', class_=('_3O0U0u'))
whole_product_list= []
whole_url_list= []
main_product_list= []
main_url_list= []
for i in url:
    tag_a_data= i.find_all('a')
    for l in tag_a_data:
        product_list = l.get('title')
        if product_list:
            main_product_list.append(product_list)
        else:
            product_ok= l.find('div', class_= '_3wU53n').text
            main_product_list.append(product_ok)
print(main_product_list)
it starts giving an attribute error:
Traceback (most recent call last):
  File "tess.py", line 28, in <module>
    product_ok= l.find('div', class_= '_3wU53n').text
AttributeError: 'NoneType' object has no attribute 'text'
I don't understand why the first script works fine with its if-else but the second one does not.
Upvotes: 0
Views: 92
Reputation: 118
Suppose the following links are collected as your "l" values:
<a title="title1"><div class="_3wU53n">xyz</div></a>
<a><div>xyz</div></a>
<a title="title3"><div class="_3wU53n">xyz</div></a>
Using the first code, your product_list variable is set for item1 and item3, so their text is appended. For item2 the else branch calls l.get('title'), which simply returns None when the attribute is missing, so the code works without any problem (it just appends None).
Using the second code, product_list is set for item1 and item3. For item2 there is no title, so the else branch runs, but the required div tag doesn't exist for the second item either. l.find(...) returns None, and calling .text on it causes the attribute error.
The simple thing is that a link won't always have the required div tag, and l.get() tolerates a missing attribute while .text on a missing tag does not.
The following change should get it working:
from bs4 import BeautifulSoup
import requests
pr= input("search: ")
source= requests.get('https://www.flipkart.com/search?q={}&otracker=search&otracker1=search&marketplace=FLIPKART&as-show=on&as=off'.format(pr)).content
soup = BeautifulSoup(source, 'html.parser')
url= soup.find_all('div', class_=('_3O0U0u'))
whole_product_list= []
whole_url_list= []
main_product_list= []
main_url_list= []
for i in url:
    tag_a_data= i.find_all('a')
    for l in tag_a_data:
        product_list = l.get('title')
        if product_list:
            main_product_list.append(product_list)
        else:
            if l.find("div", class_='_3wU53n'):
                product_ok= l.find('div', class_= '_3wU53n').text
                main_product_list.append(product_ok)
print(main_product_list)
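To check the behaviour without hitting the live site, here is a minimal sketch that runs the same logic against a small made-up HTML snippet. The class names come from the question; the markup itself, the function name product_names, and the product names are assumptions for illustration only. The third link has neither a title nor the div, which is the case that breaks the second script.

```python
from bs4 import BeautifulSoup

# Made-up markup (an assumption, not Flipkart's real HTML).
# The third link has neither a title attribute nor the name div.
html = """
<div class="_3O0U0u">
  <a href="/p1"><div class="_3wU53n">Samsung Galaxy</div></a>
  <a href="/p2" title="Running Shoes"></a>
  <a href="/p3"><img src="x.png"></a>
</div>
"""

def product_names(markup):
    """Collect a name per link: title first, then the div text, else skip."""
    soup = BeautifulSoup(markup, "html.parser")
    names = []
    for card in soup.find_all("div", class_="_3O0U0u"):
        for link in card.find_all("a"):
            title = link.get("title")                      # None if missing
            name_div = link.find("div", class_="_3wU53n")  # None if missing
            if title:
                names.append(title)
            elif name_div is not None:
                names.append(name_div.text)
            # Links with neither a title nor the div are skipped.
    return names

names = product_names(html)
print(names)  # ['Samsung Galaxy', 'Running Shoes']
```

Looking up the div once and reusing the result also avoids calling find() twice per link.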
Upvotes: 1
Reputation: 504
In this line:
product_ok= l.find('div', class_= '_3wU53n').text
l.find('div', class_= '_3wU53n') returns None, meaning it doesn't find the div. None has no text attribute, so an AttributeError exception is raised.
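The difference between the two scripts can be reproduced on a single link. This is only a small sketch with a made-up tag that has neither a title nor the div: get() on a missing attribute quietly returns None, while .text on the None returned by find() raises.

```python
from bs4 import BeautifulSoup

# Made-up link with no title attribute and no name div (an assumption).
link = BeautifulSoup('<a href="/p3"></a>', "html.parser").a

missing_title = link.get("title")                 # no error, just None
missing_div = link.find("div", class_="_3wU53n")  # no match, also None

try:
    missing_div.text   # attribute access on None
    raised = False
except AttributeError:
    raised = True

print(missing_title, missing_div, raised)  # None None True
```

This is why the first script only ends up with None entries in its list, while the second one crashes.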
A fix would be to use the walrus operator (Python 3.8 and later):
if product_ok := l.find('div', class_= '_3wU53n'):
    main_product_list.append(product_ok.text)
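As a rough sketch of how the walrus check could be wrapped up for both lookups, the helper below tries the title first and falls back to the div. The name label_for and the sample markup are hypothetical, and it needs Python 3.8 or later.

```python
from bs4 import BeautifulSoup

def label_for(link):
    """Return the title attribute, else the name div's text, else None."""
    if title := link.get("title"):
        return title
    if name_div := link.find("div", class_="_3wU53n"):
        return name_div.text
    return None

# Made-up links: one with a title, one with the div, one with neither.
soup = BeautifulSoup(
    '<a title="Shoes"></a><a><div class="_3wU53n">Phone</div></a><a></a>',
    "html.parser",
)
labels = [label_for(a) for a in soup.find_all("a")]
print(labels)  # ['Shoes', 'Phone', None]
```

Returning None for links with neither piece leaves it to the caller to decide whether to skip them or keep a placeholder.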
Upvotes: 2