Reputation: 11813
I unfortunately have to scrape a webpage, and I'm doing so via Google Docs.
The document looks like this:
<div class='search'>
<div class='new'>
<img src="product1.png" title="Product 1 - €2.40"/>
</div>
<div class='new dupe'> <!-- this one appears dimmed: there's a better offer -->
<!-- I don't want these in my results -->
<img src="product1.png" title="Product 1 - €2.70"/>
</div>
</div>
The current XPath looks like this:
//div[@class='search']//@title
How can I modify it to exclude the 'dupe' items? I could do
//div[@class='search']//div[not(@class='dupe')]//@title
...but that won't work, because no item actually has a class list that is exactly 'dupe'.
Upvotes: 0
Views: 203
Reputation: 32094
/div[@class='search']/div[not(contains(@class, 'dupe'))]//@title
I would try to avoid using // and be more specific:
/div[@class='search']/div[not(contains(@class, 'dupe'))]/img/@title
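One caveat, offered as a sketch rather than a definitive fix: contains(@class, 'dupe') is a substring test, so it would also exclude an element whose class merely contains those letters (a hypothetical class such as 'nodupes', which does not appear in the sample). Assuming the classes are separated by whitespace, a common XPath 1.0 idiom is to pad the normalized attribute with spaces and test for the whole token:

```xpath
/div[@class='search']/div[not(contains(concat(' ', normalize-space(@class), ' '), ' dupe '))]/img/@title
```

Against the sample document this should select only the title of the first product: 'new dupe' is padded to ' new dupe ', which contains the token ' dupe ', so that div is filtered out, while ' new ' does not contain it.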
Upvotes: 4