Reputation: 131
How can I exclude elements from being scraped using contains with OR? The XPath I currently use is not working: //div/li[contains(text(), 'Night') OR contains(text(), 'Big')
Upvotes: 0
Views: 138
Reputation: 5905
To complete @Sergii Dmytrenko's answer, also use a lowercase or operator (XPath keywords are case-sensitive, so OR is not recognized).
//div/li[contains(text(), 'Night') or contains(text(), 'Big')]
The preceding XPath returns li elements containing the text "Night" or "Big" (case-sensitive).
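As a quick check of the lowercase/uppercase point, here is a minimal sketch in Python. It assumes the third-party lxml library is installed, and the sample markup is hypothetical (the question does not show its HTML).

```python
from lxml import etree

# Hypothetical sample markup, not taken from the question.
root = etree.fromstring(
    "<div><li>Night shift</li><li>Big fish</li><li>Day trip</li></div>"
)

# Lowercase `or` is a valid XPath operator.
matched = root.xpath(
    "//div/li[contains(text(), 'Night') or contains(text(), 'Big')]"
)
matched_text = [li.text for li in matched]

# Uppercase `OR` is not an XPath keyword, so the expression fails to parse.
try:
    root.xpath("//div/li[contains(text(), 'Night') OR contains(text(), 'Big')]")
    uppercase_failed = False
except etree.XPathError:
    uppercase_failed = True

print(matched_text)
print(uppercase_failed)
```

Other XPath engines should reject the uppercase form as well, though the exact error type will differ.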
To exclude elements, you can use the not() function as previously described.
Side note: using != (not equal) with the and operator is also a way to exclude elements:
//div/li[text()!='Night' and text()!='Big']
This excludes only elements whose text is exactly "Night" or "Big" (with no other text).
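The difference between the exact-match form and the contains form can be seen side by side. This is a sketch assuming lxml is available; the li values are made up for illustration.

```python
from lxml import etree

# Hypothetical list items chosen to show exact vs. partial matches.
doc = etree.fromstring(
    "<div><li>Night</li><li>Big</li><li>Night shift</li><li>Day</li></div>"
)

# != compares the whole text, so "Night shift" is NOT excluded.
exact = doc.xpath("//div/li[text()!='Night' and text()!='Big']")
exact_text = [li.text for li in exact]

# not(contains(...)) also excludes items that merely include the word.
partial = doc.xpath(
    "//div/li[not(contains(text(), 'Night') or contains(text(), 'Big'))]"
)
partial_text = [li.text for li in partial]

print(exact_text)    # items kept by the != version
print(partial_text)  # items kept by the not(contains()) version
```

So the != version is only appropriate when the unwanted items consist of exactly those words and nothing else.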
EDIT : Assuming you have :
<div>
<h2>Night of the living dead</h2>
<h2>Big fish</h2>
<h2>Save the last dance</h2>
<h2>Tomorrow never die</h2>
<h2>Australia nuclear war</h2>
</div>
To select elements which don't contain "Night", "Big", or "Australia", you have two options:
Using or operators inside a not condition:
//div/h2[not(contains(text(),'Night') or contains(text(),'Big') or contains(text(),'Australia'))]
Using multiple not conditions joined with and operators:
//div/h2[not(contains(text(),'Night')) and not(contains(text(),'Big')) and not(contains(text(),'Australia'))]
Output : 2 nodes :
Save the last dance
Tomorrow never die
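To confirm that both options give the same result on the sample above, here is a small sketch using Python with lxml (an assumption; any XPath 1.0 engine should behave the same way).

```python
from lxml import etree

# The sample markup from the EDIT above.
sample = etree.fromstring(
    "<div>"
    "<h2>Night of the living dead</h2>"
    "<h2>Big fish</h2>"
    "<h2>Save the last dance</h2>"
    "<h2>Tomorrow never die</h2>"
    "<h2>Australia nuclear war</h2>"
    "</div>"
)

# Option 1: or operators inside a single not().
option_one = sample.xpath(
    "//div/h2[not(contains(text(),'Night') or contains(text(),'Big') "
    "or contains(text(),'Australia'))]"
)

# Option 2: several not() conditions joined with and.
option_two = sample.xpath(
    "//div/h2[not(contains(text(),'Night')) and not(contains(text(),'Big')) "
    "and not(contains(text(),'Australia'))]"
)

titles_one = [h2.text for h2 in option_one]
titles_two = [h2.text for h2 in option_two]

print(titles_one)
print(titles_one == titles_two)
```

The two expressions are equivalent by De Morgan's law, so the choice between them is mostly a matter of readability.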
Upvotes: 1
Reputation: 301
Your XPath expression (once the typos are corrected: li[contains(text(), 'Night') or contains(text(), 'Big')]) will return li elements containing the text "Night" or "Big".
To exclude these, the correct expression is
//div/li[not(contains(text(), 'Night') or contains(text(), 'Big'))]
or you may try
//div/li[not(contains(text(), 'Night')) and not(contains(text(), 'Big'))]
Upvotes: 1
Reputation: 176
Your XPath should end with ']'; as written it is invalid.
If you would like to exclude 'Night' and 'Big', you may try this:
//div/li[not(contains(text(), 'Night') OR contains(text(), 'Big'))]
Upvotes: 0