Reputation: 63
When I try to get the title of a Sony headset using the code below, the result is empty.
import requests
from bs4 import BeautifulSoup
URL = 'https://www.amazon.com/Sony-Noise-Cancelling-Headphones-WH1000XM3/dp/B07G4MNFS1/ref=sxin_0_ac_d_rm?ac_md=0-0-c29ueQ%3D%3D-ac_d_rm&keywords=sony&pd_rd_i=B07G4MNFS1&pd_rd_r=3e6d5325-8ee4-4ba8-a84f-1b7cf2ce98bf&pd_rd_w=BVSFq&pd_rd_wg=I0LMZ&pf_rd_p=e2f20af2-9651-42af-9a45-89425d5bae34&pf_rd_r=VGT25BXXZNDE3B61A994&psc=1&qid=1577253649&smid=ATVPDKIKX0DER'
headers = {"User-Agent":"Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36"}
page = requests.get(URL, headers=headers)
soup = BeautifulSoup(page.content, "html.parser")
soup.prettify()
#print(soup)
title = soup.find_all('span', {'id':'productTitle'})
print(title, len(title))
Current output:
[] 0
Upvotes: 0
Views: 403
Reputation: 4803
I spent the last two hours trying to scrape that title with BeautifulSoup. I also tried scraping other elements on the page, with no success. I tried writing the raw content to a file, and that broke because of strange characters in it.
I tried Ahmed's answer and still got nothing. I tried a bunch of other solutions I found online and still got nothing. I can't for the life of me figure out how to use BeautifulSoup to scrape this.
I know you use Selenium, so here is the Selenium solution.
from selenium import webdriver
from selenium.webdriver.common.by import By
bot = webdriver.Chrome()
bot.get("https://www.amazon.com/Sony-Noise-Cancelling-Headphones-WH1000XM3/dp/B07G4MNFS1/ref=sxin_0_ac_d_rm?ac_md=0-0-c29ueQ==-ac_d_rm&keywords=sony&pd_rd_i=B07G4MNFS1&pd_rd_r=3e6d5325-8ee4-4ba8-a84f-1b7cf2ce98bf&pd_rd_w=BVSFq&pd_rd_wg=I0LMZ&pf_rd_p=e2f20af2-9651-42af-9a45-89425d5bae34&pf_rd_r=VGT25BXXZNDE3B61A994&psc=1&qid=1577253649&smid=ATVPDKIKX0DER")
# find_element_by_id was removed in Selenium 4; find_element(By.ID, ...) replaces it
title = bot.find_element(By.ID, 'productTitle').text
print(title)
# quit() closes the browser window and ends the driver session
bot.quit()
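As for the raw-content dump that broke earlier: one possible cause is writing the response through a text-mode file, which forces an encoding onto bytes that may not fit it. Here is a minimal sketch, assuming the bytes come from `page.content`. The helper name `dump_raw` and the output file name are hypothetical.

```python
import tempfile
from pathlib import Path


def dump_raw(content: bytes, path: Path) -> int:
    """Write response bytes to disk unchanged and return the byte count."""
    # Binary mode skips text encoding entirely, so unusual characters
    # in the page cannot raise UnicodeEncodeError at this step.
    return path.write_bytes(content)


# Stand-in bytes (assumption); in the scraper this would be page.content.
sample = "<span id='productTitle'>Sony \u2013 WH1000XM3</span>".encode("utf-8")
out_path = Path(tempfile.gettempdir()) / "page_dump.html"
written = dump_raw(sample, out_path)
```

Opening the dumped file in a browser or editor then shows what the server actually sent, which may not be the product page.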
Upvotes: 1
Reputation: 11525
import requests
from bs4 import BeautifulSoup
r = requests.get("https://www.amazon.com/Sony-Noise-Cancelling-Headphones-WH1000XM3/dp/B07G4MNFS1/ref=sxin_0_ac_d_rm?ac_md=0-0-c29ueQ==-ac_d_rm&keywords=sony&pd_rd_i=B07G4MNFS1&pd_rd_r=3e6d5325-8ee4-4ba8-a84f-1b7cf2ce98bf&pd_rd_w=BVSFq&pd_rd_wg=I0LMZ&pf_rd_p=e2f20af2-9651-42af-9a45-89425d5bae34&pf_rd_r=VGT25BXXZNDE3B61A994&psc=1&qid=1577253649&smid=ATVPDKIKX0DER")
soup = BeautifulSoup(r.text, 'html.parser')
# find_all is the current name for findAll; the loop body must be indented
for item in soup.find_all("span", {'id': 'productTitle'}):
    print(item.get_text(strip=True))
Output:
Sony Noise Cancelling Headphones WH1000XM3: Wireless Bluetooth Over the Ear Headphones with Mic and Alexa voice control - Industry Leading Active Noise Cancellation - Black
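If the same loop prints nothing for you, as in the question, it may be worth checking whether the server returned the product page at all. Here is a rough sketch, assuming a blocked request shows up either as a non-200 status or as a page without the `productTitle` id. The function name `diagnose` and its messages are hypothetical, and real blocked pages may look different.

```python
def diagnose(status_code: int, html: str) -> str:
    """Classify a response before handing it to BeautifulSoup."""
    if status_code != 200:
        return f"HTTP {status_code}: the server did not return a normal page"
    # Check both quoting styles, since either is valid HTML.
    if 'id="productTitle"' not in html and "id='productTitle'" not in html:
        return "200 OK but no productTitle in the HTML: probably not the product page"
    return "productTitle present: parsing should find it"


# Stand-in inputs (assumptions, not real Amazon responses);
# in the code above these would be r.status_code and r.text.
print(diagnose(503, ""))
print(diagnose(200, "<html><title>Placeholder page</title></html>"))
print(diagnose(200, '<span id="productTitle"> Sony </span>'))
```

A result in the second category suggests the problem is the response itself rather than the `find_all` call.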
Upvotes: 1