Reputation: 15
I want to get the product title from an Amazon URL: https://www.amazon.in/BATA-Fenny-Sneakers-7-India-8219990/dp/B07P8PMS25/ref=asc_df_B07P8PMS25/?tag=googleshopdes-21&linkCode=df0&hvadid=397006879402&hvpos=&hvnetw=g&hvrand=2284563689588211961&hvpone=&hvptwo=&hvqmt=&hvdev=c&hvdvcmdl=&hvlocint=&hvlocphy=1007824&hvtargid=pla-837374864561&psc=1&ext_vrnc=hi. I tried this code:
from bs4 import *
import requests
head={'user-agent':'betbargain/android-7.0/0.0.5'}
amaurl=input('Enter amazon url')
amazinfo=requests.get(amaurl,headers=head)
amasoup=BeautifulSoup(amazinfo.text,'lxml')
amatit=amasoup.find("span", attrs={"id":'productTitle'}).string.strip()
print(amatit)
But when I enter the URL, I get this error:
Traceback (most recent call last):
  File "c:/Users/rauna/Desktop/bb.py", line 7, in <module>
    amatit=amasoup.find("span", attrs={"id":'productTitle'}).string.strip()
AttributeError: 'NoneType' object has no attribute 'string'
I have no idea why this has happened. Please tell me where I am wrong. Thanks in advance.
Upvotes: 0
Views: 680
Reputation: 195408
Change the search to <h1> with id="title":
from bs4 import *
import requests
head={'user-agent':'betbargain/android-7.0/0.0.5'}
# amaurl=input('Enter amazon url')
amaurl = 'https://www.amazon.in/BATA-Fenny-Sneakers-7-India-8219990/dp/B07P8PMS25/ref=asc_df_B07P8PMS25/?tag=googleshopdes-21&linkCode=df0&hvadid=397006879402&hvpos=&hvnetw=g&hvrand=2284563689588211961&hvpone=&hvptwo=&hvqmt=&hvdev=c&hvdvcmdl=&hvlocint=&hvlocphy=1007824&hvtargid=pla-837374864561&psc=1&ext_vrnc=hi'
amazinfo=requests.get(amaurl,headers=head)
amasoup=BeautifulSoup(amazinfo.text,'lxml')
amatit=amasoup.find("h1", attrs={"id":'title'}).get_text(strip=True) # <-- change to <h1 id="title">
print(amatit)
Prints:
BATA Men's Fenny Sneakers
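In this thread the span with id productTitle was missing from the response while the h1 with id="title" was present, so a more defensive variant could try both in turn. This is only a minimal sketch: the helper name extract_title, the CANDIDATES list, and the sample HTML are hypothetical, and it uses the built-in html.parser so that it runs without lxml and without a network request.

```python
from bs4 import BeautifulSoup

# Hypothetical list of (tag name, id) pairs to try, in order of preference.
CANDIDATES = [("span", "productTitle"), ("h1", "title")]


def extract_title(html):
    """Return the text of the first matching title element, or None."""
    soup = BeautifulSoup(html, "html.parser")
    for tag_name, tag_id in CANDIDATES:
        tag = soup.find(tag_name, attrs={"id": tag_id})
        if tag is not None:  # find() returns None when nothing matches
            return tag.get_text(strip=True)
    return None


# Made-up sample markup standing in for the downloaded page.
sample = '<html><body><h1 id="title"> BATA Men\'s Fenny Sneakers </h1></body></html>'
print(extract_title(sample))  # BATA Men's Fenny Sneakers
```

With real pages you would pass amazinfo.text to the helper and handle a None result, for example by printing a message instead of the title.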
Upvotes: 1
Reputation: 3908
If find does not find anything, it returns None (have a look at the documentation). You have to check whether find has found anything before doing something with the result.
Also, instead of .string, I think you want str(). Note that str() gives you the tag's full markup; if you only want the text, get_text() is the usual choice.
Try this instead:
from bs4 import BeautifulSoup
import requests

head = {'user-agent': 'betbargain/android-7.0/0.0.5'}
amaurl = input('Enter amazon url')
amazinfo = requests.get(amaurl, headers=head)
amasoup = BeautifulSoup(amazinfo.text, 'lxml')
findResult = amasoup.find("span", attrs={"id": 'productTitle'})
amatit = ""
if findResult is not None:  # Added a check for findResult being None
    # Changed .string to str(); str() returns the tag's markup,
    # use findResult.get_text(strip=True) for the text only
    amatit = str(findResult).strip()
print(amatit)
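To make the difference between the three accessors concrete, here is a small sketch on a made-up HTML snippet (not the real Amazon page). It assumes only bs4 and the built-in html.parser; the variable names are illustrative.

```python
from bs4 import BeautifulSoup

# Made-up markup: the span contains text plus a nested <b> element.
html = '<span id="productTitle"> BATA <b>Men\'s</b> Sneakers </span>'
soup = BeautifulSoup(html, "html.parser")

tag = soup.find("span", attrs={"id": "productTitle"})
missing = soup.find("span", attrs={"id": "doesNotExist"})

print(missing)     # None: find() found no matching element
print(tag.string)  # None: .string is only set when the tag has a single child
print(str(tag))    # the whole element, including the <span> and <b> markup
print(tag.get_text(" ", strip=True))  # BATA Men's Sneakers
```

So a None check guards against the AttributeError from the question, and get_text() is the accessor that returns just the visible text.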
Upvotes: 1
Reputation: 2721
Try this:
from bs4 import *
import requests
head={'user-agent':'betbargain/android-7.0/0.0.5'}
amaurl=input('Enter amazon url')
amazinfo=requests.get(amaurl,headers=head)
amasoup=BeautifulSoup(amazinfo.text,'lxml')
# str(None) is the text 'None', so this prints None instead of raising when the span is missing
amatit=str(amasoup.find("span", attrs={"id":'productTitle'})).strip()
print(amatit)
Upvotes: 0