saraherceg

Reputation: 341

Only extract information from the div class if it contains a certain word using xpath

I am trying to scrape information from the following website https://www.rawson.co.za

However, the information sometimes changes its position. I am struggling to pick out only the 'Building Size' entry and store it as the size, since the div looks like this:

<div class="features__item">
    <div class="features__icon icon-house" aria-hidden="true"></div>
    <div class="features__label">Building Size 130m²</div>
</div>

I am able to extract that, but sometimes the expression picks up other information, either because the property has no building size or because something else sits at that position.

This is what I have for size now (I am accessing the information from the child/property pages):

size = response.xpath("//div[@class='features']/div[@class='features__list']/div[@class='row']/div[@class='col col--1-2'][2]/div[@class='features__item'][1]/div[@class='features__label']/text()").re(r'\d+')[0]

What I would like is to take the Building Size information (numbers only) if it exists, and store None if no building size is available. I am struggling with matching on the text inside the div. I have tried to construct a for loop that checks whether the text contains 'Building Size', but nothing has worked yet. Any help would be very much appreciated! Thank you!

Upvotes: 1

Views: 218

Answers (1)

gangabass

Reputation: 10666

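Simple: select the label by its text with contains() instead of by position, and use re_first, which returns None when nothing matches: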

size = response.xpath("//div[@class='features__label'][contains(., 'Building Size')]/text()").re_first(r'\d+')
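To see the logic without running a spider, here is a minimal sketch of the same check written as an explicit loop with only the Python standard library. It is not the Scrapy API: the sample markup, the "Bedrooms 3" label, and the building_size helper are made up for illustration, and xml.etree is used only because the sample is well-formed. Real pages should go through the response.xpath expression above.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample modelled on the snippet in the question;
# the "Bedrooms" item is invented to show a non-matching label.
SAMPLE = """
<div class="features">
  <div class="features__item">
    <div class="features__icon icon-bed" aria-hidden="true"></div>
    <div class="features__label">Bedrooms 3</div>
  </div>
  <div class="features__item">
    <div class="features__icon icon-house" aria-hidden="true"></div>
    <div class="features__label">Building Size 130m²</div>
  </div>
</div>
"""


def building_size(markup):
    """Return the first run of digits from the 'Building Size' label, or None."""
    root = ET.fromstring(markup.strip())
    # Look at every label div, wherever it sits in the tree.
    for label in root.iterfind(".//div[@class='features__label']"):
        text = label.text or ""
        # Same idea as contains(., 'Building Size') in the XPath.
        if "Building Size" in text:
            match = re.search(r"\d+", text)
            return match.group() if match else None
    # No label mentions a building size.
    return None


print(building_size(SAMPLE))
```

Because the label is matched by its text rather than by its index, it no longer matters where the item appears on the page, and a property without the label simply yields None.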

Upvotes: 2
