user12223003

BeautifulSoup4 find() function not working

I'm trying to extract Some_Product_Title from this block of HTML:

<div id="titleSection" class="a-section a-spacing-none">
        <h1 id="title" class="a-size-large a-spacing-none">
            <span id="productTitle" class="a-size-large">


                        Some_Product_Title


            </span>

The lines below work fine:

page = requests.get(URL, headers = headers)
soup = BeautifulSoup(page.content, 'html.parser')

But the line below does not:

title = soup.find_all(id="productTitle")

When I try print(title), I get None as the console output.

Does anyone know how to fix this?

Upvotes: 1

Views: 1688

Answers (3)

John Park

Reputation: 335

You're probably having trouble with .find() because the site you are building the soup from is, in all likelihood, generating its HTML via JavaScript.

If that is the case, try the following to find an element by id:

soup1 = BeautifulSoup(page.content, "html.parser")
soup2 = BeautifulSoup(soup1.prettify(), "html.parser")
title = soup2.find(id = "productTitle")
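
Before changing the parsing code, it may be worth checking whether the id appears at all in what requests downloaded. Below is a minimal sketch of that check. The helper has_element_id is hypothetical, and the two HTML strings are made-up stand-ins for page.content, not output from any real site:

```python
from bs4 import BeautifulSoup


def has_element_id(html: str, element_id: str) -> bool:
    """Return True if the parsed markup contains a tag with this id."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.find(id=element_id) is not None


# Made-up stand-ins for page.content: one page that ships the element in its
# HTML, and one that only ships an empty container plus a script tag.
static_html = '<span id="productTitle">Some_Product_Title</span>'
scripted_html = '<div id="root"></div><script src="app.js"></script>'

print(has_element_id(static_html, "productTitle"))    # True
print(has_element_id(scripted_html, "productTitle"))  # False
```

If the check comes back False for the real response, the element is simply not in the downloaded markup, and no choice of find() arguments will locate it.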

Upvotes: 3

Hameda169

Reputation: 633

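import requests
from bs4 import BeautifulSoup

URL = 'https://your-own.address/some-thing'
# Placeholder value (an assumption): the original snippet never defines headers.
headers = {'User-Agent': 'Mozilla/5.0'}
page = requests.get(URL, headers=headers)
soup = BeautifulSoup(page.content, 'html.parser')
# find_all is the current name for findAll; this matches any tag with the id.
title = soup.find_all(id="productTitle")
# Unpack the result list so each matching tag is printed on its own.
print(*title)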
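
For reference, find_all() and find() report a miss differently, which matters when reading the printed output. Here is a small sketch on a trimmed copy of the question's HTML (the id noSuchId is made up for illustration):

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup from the question.
html = '<h1 id="title"><span id="productTitle"> Some_Product_Title </span></h1>'
soup = BeautifulSoup(html, "html.parser")

matches = soup.find_all(id="productTitle")  # list-like result with one tag
missing = soup.find_all(id="noSuchId")      # empty result, not None
single = soup.find(id="noSuchId")           # None when nothing matches

print(len(matches), len(missing), single)   # 1 0 None
```

So a printed None points at find(), while an empty [] points at find_all().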

Upvotes: 1

jmiller

Reputation: 9

BS4 has CSS selectors built in, so you can use: soup.select('#productTitle')

This would also work: title = soup.find_all("span", {"id": "productTitle"})
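
To get just the title text rather than the tag, something like the sketch below should do. It uses the HTML from the question, with the closing h1 and div tags added as an assumption since the posted snippet is cut off. The name title_text is only illustrative:

```python
from bs4 import BeautifulSoup

# HTML from the question; the closing </h1> and </div> are assumed.
html = """
<div id="titleSection" class="a-section a-spacing-none">
    <h1 id="title" class="a-size-large a-spacing-none">
        <span id="productTitle" class="a-size-large">

            Some_Product_Title

        </span>
    </h1>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# select_one returns the first match for a CSS selector, or None.
tag = soup.select_one("#productTitle")

# strip=True removes the surrounding newlines and indentation.
title_text = tag.get_text(strip=True) if tag is not None else None
print(title_text)  # Some_Product_Title
```

The None check keeps the script from raising AttributeError when the element is absent from the page.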

Upvotes: 1
