Only extract information from the div class if it contains a certain word using xpath

Question

I am trying to scrape information from the following website https://www.rawson.co.za

However, the information sometimes changes its position. I am struggling to check for only the 'Building Size' entry and store that as the size, since the text inside the div looks like this:

Building Size 130m²

I am able to extract that, but sometimes I get other information instead, because the property either does not have a building size or something else sits at that position.

This is what I have for size now (I am accessing the information from the child/property pages):

size = response.xpath("//div[@class='features']/div[@class='features__list']/div[@class='row']/div[@class='col col--1-2'][2]/div[@class='features__item'][1]/div[@class='features__label']/text()").re(r'\d+')[0]

What I would like to get is the Building Size value (numbers only) if it exists, and None if no building size is available. I am struggling with matching on the text inside the div. I have tried to construct a for loop that checks whether the text contains 'Building Size', but nothing has worked yet. Any help would be very much appreciated! Thank you!

gangabass · Accepted Answer

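Simple: select the label by its text rather than by its position, and use re_first, which returns None when nothing matches: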

size = response.xpath("//div[@class='features__label'][contains(., 'Building Size')]/text()").re_first(r'\d+')
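To try the same idea without running the spider, here is a minimal standard-library sketch. It does not use Scrapy or XPath's contains(); it swaps in xml.etree.ElementTree plus a plain Python substring check to mirror the filter above. The sample markup, the 'Erf Size' label, and the building_size helper are assumptions for illustration, since the real page's HTML is not shown in the question.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical markup: only the class name and the label text come from the question.
WITH_SIZE = """
<div class="features">
  <div class="features__label">Erf Size 500m²</div>
  <div class="features__label">Building Size 130m²</div>
</div>
"""
WITHOUT_SIZE = """
<div class="features">
  <div class="features__label">Erf Size 500m²</div>
</div>
"""


def building_size(html):
    """Return the digits of the 'Building Size' label, or None if it is absent."""
    root = ET.fromstring(html.strip())
    for div in root.iter("div"):
        # Only consider the label divs, wherever they sit in the tree.
        if div.get("class") != "features__label":
            continue
        text = "".join(div.itertext())
        # Match on the label text, not on the element's position.
        if "Building Size" in text:
            match = re.search(r"\d+", text)
            return match.group() if match else None
    return None


print(building_size(WITH_SIZE))     # 130
print(building_size(WITHOUT_SIZE))  # None
```

Because the lookup keys on the label text, a page where another feature occupies the first slot no longer yields the wrong number, and a page without the label gives None instead of raising an IndexError.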
