badp

Reputation: 11813

How can I build an XPath that matches all divs in a page that do NOT have a certain class?

I unfortunately have to scrape a webpage, and I'm doing so via Google Docs.

The document looks like this:

<div class='search'>
 <div class='new'>
  <img src="product1.png" title="Product 1 - €2.40"/>
 </div>
 <div class='new dupe'> <!-- this one appears dimmed: there's a better offer -->
                        <!-- I don't want these in my results -->
  <img src="product1.png" title="Product 1 - €2.70"/>
 </div>
</div>

The current XPath looks like this:

//div[@class='search']//@title

How can I modify it? I could do

//div[@class='search']//div[not(@class='dupe')]//@title

...but that doesn't exclude anything, because no div has a class attribute that is exactly 'dupe'.

Upvotes: 0

Views: 203

Answers (1)

newtover

Reputation: 32094

/div[@class='search']/div[not(contains(@class, 'dupe'))]//@title

I would try to avoid using // and be more specific:

/div[@class='search']/div[not(contains(@class, 'dupe'))]/img/@title
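
To sanity-check which titles the expressions above should return, here is a minimal sketch using Python's standard-library ElementTree on the sample markup from the question. ElementTree's XPath subset does not support contains() or not(), so the class test is done in Python instead; it mirrors the intent of the expression rather than running it verbatim. The helper name titles_without_class is hypothetical, and the sketch assumes the search div is the root of the parsed fragment.

```python
import xml.etree.ElementTree as ET

# Sample markup from the question (comments omitted).
PAGE = """
<div class='search'>
 <div class='new'>
  <img src="product1.png" title="Product 1 - €2.40"/>
 </div>
 <div class='new dupe'>
  <img src="product1.png" title="Product 1 - €2.70"/>
 </div>
</div>
"""


def titles_without_class(root, unwanted):
    """Hypothetical helper: img titles from child divs whose class list lacks `unwanted`."""
    titles = []
    for div in root.findall("div"):
        # Compare whole class tokens, so 'dupe' does not match e.g. 'dupes'.
        if unwanted in div.get("class", "").split():
            continue
        titles.extend(img.get("title") for img in div.findall("img"))
    return titles


root = ET.fromstring(PAGE)  # root is the <div class='search'> element
result = titles_without_class(root, "dupe")
print(result)
```

Note that contains(@class, 'dupe') is a substring test, so it would also exclude a class such as 'dupes'. If that matters, a token-aware XPath 1.0 form is not(contains(concat(' ', normalize-space(@class), ' '), ' dupe ')).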

Upvotes: 4
