Reputation: 10074
I'm scraping a page using Kimono and I've come across some data that is structured as below.
The issue is that all of the data is stored in elements marked up as <div class="agents-stats-seperator">;
some entries have only one of these elements, some have up to 4.
Each of them holds different data that I'm trying to scrape, and the only structural difference between them is the label text, for example "Residential for sale:" or "Residential to rent:".
In Kimono you have the option to define what you want to select either by CSS path or by regex.
At the moment I'm defining it with the below:
div > div > div > div.agents-stats-seperator > div
/^()(.*?)()$/
This causes an issue because it picks up all of the <div class="agents-stats-seperator">
elements. What I've been stuck on is how to set the regular expression to target just the elements that contain the text "Residential for sale:".
I've tried using:
div > div > div > div.agents-stats-seperator > div [str="Residential to rent:"]
/^()(.*?)()$/
But to no avail. Any ideas?
For reference, here is a snippet of the HTML:
<div class="clearfix top agents-stats bg-muted">
<div class="agents-stats-seperator">
<div class="agents-stats-l">
Residential for sale:
<strong><a href="/for-sale/branch/1-click-homes-london-19269/">14</a></strong>
</div>
<div class="agents-stats-c">
Avg. asking price:
<strong class="price">£447,143</strong>
</div>
<div class="agents-stats-r">
Avg. sale listing age:
<span>18 weeks</span>
</div>
</div>
<div class="agents-stats-seperator">
<div class="agents-stats-l">
Residential to rent:
<strong><a href="/to-rent/branch/1-click-homes-london-19269/">9</a></strong>
</div>
<div class="agents-stats-c">
Avg. asking rent:
<strong class="price">£1,660 pcm</strong>
</div>
<div class="agents-stats-r">
Avg. rental listing age:
<span>3 weeks</span>
</div>
</div>
<div class="agents-stats-seperator">
<div class="agents-stats-l">
Commercial for sale
<strong><a href="/for-sale/commercial/branch/1-click-homes-london-19269/">1</a></strong>
</div>
<div class="agents-stats-c">
Avg. asking price:
<strong class="price">£700,000</strong>
</div>
<div class="agents-stats-r">
Avg. sale listing age:
<span>11 weeks</span>
</div>
</div>
<div class="agents-stats-seperator">
<div class="agents-stats-l">
Commercial to let
<strong><a href="/to-rent/commercial/branch/1-click-homes-london-19269/">1</a></strong>
</div>
<div class="agents-stats-c">
Avg. asking rent:
<strong class="price">£22,516 pa</strong>
</div>
<div class="agents-stats-r">
Avg. rental listing age:
<span>56 weeks</span>
</div>
</div>
</div>
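Outside Kimono, the structure in this snippet can be read by label rather than by position. The following is only a minimal sketch using Python's standard-library `html.parser`; the `StatsParser` class and the trimmed `SAMPLE` markup are hypothetical names introduced for illustration, and the code assumes every count sits in a `<strong>` inside a `div.agents-stats-l`, as in the snippet above.

```python
from html.parser import HTMLParser


class StatsParser(HTMLParser):
    """Collect label -> value pairs from each div.agents-stats-l block."""

    def __init__(self):
        super().__init__()
        self.stats = {}
        self._depth = 0          # div nesting depth inside the current block
        self._label = []         # text seen outside <strong>
        self._value = []         # text seen inside <strong>
        self._in_strong = False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div":
            if self._depth:
                self._depth += 1
            elif "agents-stats-l" in classes:
                # Entering a new left-hand block: reset the buffers.
                self._depth = 1
                self._label, self._value = [], []
        elif tag == "strong" and self._depth:
            self._in_strong = True

    def handle_endtag(self, tag):
        if tag == "strong":
            self._in_strong = False
        elif tag == "div" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                # Leaving the block: store the pair, dropping a trailing colon.
                label = "".join(self._label).strip().rstrip(":")
                self.stats[label] = "".join(self._value).strip()

    def handle_data(self, data):
        if self._depth:
            (self._value if self._in_strong else self._label).append(data)


# Trimmed copy of the markup from the question (two of the four blocks).
SAMPLE = """
<div class="agents-stats-seperator">
<div class="agents-stats-l">
Residential for sale:
<strong><a href="/for-sale/branch/1-click-homes-london-19269/">14</a></strong>
</div>
<div class="agents-stats-r">
Avg. sale listing age:
<span>18 weeks</span>
</div>
</div>
<div class="agents-stats-seperator">
<div class="agents-stats-l">
Residential to rent:
<strong><a href="/to-rent/branch/1-click-homes-london-19269/">9</a></strong>
</div>
</div>
"""

parser = StatsParser()
parser.feed(SAMPLE)
stats = parser.stats
print(stats.get("Residential for sale"))
```

Because the lookup is keyed on the label, it does not matter whether an entry has one block or four, or in which order they appear; a missing label simply yields `None` from `stats.get`.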
Upvotes: 1
Views: 639
Reputation: 11
Try something like:
div:nth-child(1).agents-stats-seperator > div:nth-child(1).agents-stats-l > strong > a
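A position-based selector like this relies on "Residential for sale:" always being the first block. A possible alternative is to keep the original CSS path and pin the regex to the label instead. The sketch below tests such a pattern in Python; it assumes that Kimono applies the regex to the element's text and keeps the middle of the three capture groups, as the `/^()(.*?)()$/` form in the question suggests. The pattern, the `extract` helper, and the sample strings are illustrative assumptions, not confirmed Kimono behaviour.

```python
import re

# Hypothetical pattern: group 1 pins the match to one label,
# group 2 is the value to keep, group 3 absorbs trailing whitespace.
PATTERN = re.compile(r"^\s*(Residential for sale:\s*)(.*?)(\s*)$", re.DOTALL)


def extract(text):
    """Return the captured value, or None when the label does not match."""
    match = PATTERN.match(text)
    return match.group(2) if match else None


# Text content of three of the left-hand blocks from the question.
samples = [
    "Residential for sale:\n14",
    "Residential to rent:\n9",
    "Commercial for sale\n1",
]
results = [extract(s) for s in samples]
print(results)
```

Only the first sample matches, so the other blocks would yield nothing instead of being picked up alongside it. In Kimono's notation the equivalent pattern would be written as `/^\s*(Residential for sale:\s*)(.*?)(\s*)$/`, again under the assumption stated above.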
Upvotes: 0