Maria Georgali

Reputation: 659

Cannot get text of a span attribute using BeautifulSoup

I am trying to get the value of data-nodeid from the following element:

<span id="SkuNumber" itemprop="identifier" content="sku:473768" data-nodeid="176579" class="product-code col-lg-4 col-md-4">ΚΩΔ. 473768</span></div>

I tried the following:

price_nodes = soup.find('span', attrs={'id': 'SkuNumber'})
datanode = price_nodes.select_one('span[data-nodeid]')

But I get None. How can I fix this? Thank you.

Upvotes: 1

Views: 71

Answers (2)

Daan Klijn

Reputation: 1694

from bs4 import BeautifulSoup

html = '<span id="SkuNumber" itemprop="identifier" content="sku:473768" data-nodeid="176579" class="product-code col-lg-4 col-md-4">ΚΩΔ. 473768</span>'
# Name the parser explicitly; omitting it makes bs4 emit a warning and guess one
soup = BeautifulSoup(html, 'html.parser')

price_nodes = soup.find('span', attrs={'id': 'SkuNumber'})
print(price_nodes['data-nodeid'])

Upvotes: 1

Maaz

Reputation: 2445

If price_nodes is populated correctly,

i.e. price_nodes =

<span id="SkuNumber" itemprop="identifier" content="sku:473768" data-nodeid="176579" class="product-code col-lg-4 col-md-4">ΚΩΔ. 473768</span>

You just have to do this:

datanode = price_nodes.get('data-nodeid')

The full code should be:

from bs4 import BeautifulSoup as soup

html = '<div><span id="SkuNumber" itemprop="identifier" content="sku:473768" data-nodeid="176579" class="product-code col-lg-4 col-md-4">ΚΩΔ. 473768</span></div>'
page = soup(html, 'html.parser')
price_nodes = page.find('span', {'id': 'SkuNumber'})
datanode = price_nodes.get('data-nodeid')
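
As for why the original attempt returned None: select_one() only searches the descendants of the tag it is called on, so calling it on the span itself can never match that same span. The sketch below illustrates this along with the two ways of reading the attribute. It reuses the HTML from the question; the variable names (nested, from_root, missing) and the 'data-missing' attribute are made up for illustration.

```python
from bs4 import BeautifulSoup

html = (
    '<div><span id="SkuNumber" itemprop="identifier" content="sku:473768" '
    'data-nodeid="176579" class="product-code col-lg-4 col-md-4">'
    'ΚΩΔ. 473768</span></div>'
)
soup = BeautifulSoup(html, 'html.parser')

span = soup.find('span', attrs={'id': 'SkuNumber'})

# select_one() looks only at descendants of `span`, so the span
# never matches itself and the result is None.
nested = span.select_one('span[data-nodeid]')

# Called on the whole document, a similar selector does find the span.
from_root = soup.select_one('span#SkuNumber[data-nodeid]')

# Two ways to read an attribute from a tag:
node_id = span['data-nodeid']        # raises KeyError if the attribute is absent
missing = span.get('data-missing')   # hypothetical attribute; returns None instead of raising

# The visible text of the span, if that is also needed
text = span.get_text(strip=True)

print(nested, node_id, missing, text)
```

Using .get() is probably the safer choice when some pages might lack the attribute, since it returns None instead of raising.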

Upvotes: 2
