fahrradlaus

Reputation: 197

AttributeError while getting text from HTML

I don't know what happened, but the same code was still working two days ago!

What I am trying to do is get the text of the element with itemprop = "name", which is the title of the offered item. In this case: "Swatch".

import requests
import bs4
response2 = requests.get('https://www.willhaben.at/iad/kaufen-und-verkaufen/d/swatch-209522646/').content

soup2 = bs4.BeautifulSoup(response2, "lxml")

texttitle = soup2.find(itemprop = "name").get_text().strip()
print(texttitle)

However, I always get AttributeError: 'NoneType' object has no attribute 'get_text'. Could anyone explain why I get this AttributeError? Many thanks in advance.

Edit:

I also tried to locate it directly with the CSS path, but that didn't give me any result either:

texttitle = soup2.find('div.adHeadingLine div.adHeading h1.header.cXenseParse').get_text().strip()

Upvotes: 1

Views: 322

Answers (2)

Dirty Penguin

Reputation: 4412

You're getting None back because there is no element in that HTML page with an attribute called itemprop whose value is set to name.

Looking at the source, there are definitely elements that use the itemprop attribute, such as:

<div itemprop='description' class="description">
    Batterie leer,ansonsten funktionsfähig!
</div>

<div itemprop='offers' itemscope itemtype='http://schema.org/Offer' class="container right">

But there are no elements like <div itemprop='name'>, and that's why you're getting None back.
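To make the behaviour concrete, here is a minimal sketch that parses only the two snippets quoted above instead of fetching the live page. It assumes Beautiful Soup is installed and uses the built-in "html.parser" rather than "lxml" so it runs without extra dependencies; the empty closing tag on the second div is added just to keep the fragment well-formed.

```python
from bs4 import BeautifulSoup

# Inline HTML modeled on the two snippets quoted above.
# Note that no element carries itemprop="name".
html = """
<div itemprop='description' class="description">
    Batterie leer,ansonsten funktionsfähig!
</div>
<div itemprop='offers' itemscope itemtype='http://schema.org/Offer' class="container right"></div>
"""

soup = BeautifulSoup(html, "html.parser")

description = soup.find(itemprop="description")  # matches the first div
name = soup.find(itemprop="name")                # nothing matches, so this is None

print(description.get_text().strip())
print(name)
```

Calling name.get_text() at this point would raise the same AttributeError as in the question, because name is None.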

@dmitriy is correct that the most likely reason is that the website was updated.

Upvotes: 1

Dmitriy Fialkovskiy

Reputation: 3235

The error you get means that there is no such element on the page. It may have been there yesterday, but a site's markup can change.

You can check that the element matching your condition really exists before using it:

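from bs4 import BeautifulSoup
from urllib.request import urlopen  # Python 3; on Python 2 this was `from urllib2 import urlopen`

response = urlopen('https://www.willhaben.at/iad/kaufen-und-verkaufen/d/swatch-209522646/')
soup = BeautifulSoup(response, "lxml")

# find() returns None when nothing matches, so look the element up once and test it
element = soup.find(itemprop='name')
if element is not None:
    texttitle = element.text.strip()
    print(texttitle)
else:
    print('no such element')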
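The same check can be wrapped in a small helper. The sketch below is only an illustration: first_text is a hypothetical name, and the HTML is invented to match the CSS path from the question's edit, not copied from the real page. It uses select_one() because find() treats a plain string as a tag name rather than as a CSS selector, which is why the CSS path passed to find() in the question could not match anything.

```python
from bs4 import BeautifulSoup


def first_text(soup, css_selector, default=None):
    """Return the stripped text of the first match, or `default` if nothing matches."""
    element = soup.select_one(css_selector)  # None when there is no match
    if element is None:
        return default
    return element.get_text().strip()


# Hypothetical markup, invented to match the selector from the question.
html = """
<div class="adHeadingLine">
  <div class="adHeading">
    <h1 class="header cXenseParse"> Swatch </h1>
  </div>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

title = first_text(soup, "div.adHeadingLine div.adHeading h1.header.cXenseParse")
missing = first_text(soup, '[itemprop="name"]', default="no such element")

print(title)
print(missing)
```

Returning a default instead of raising keeps a scraper running when the site's markup changes, though logging the miss is usually worth adding so the change does not go unnoticed.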

Upvotes: 1
