Reputation: 1032
I'm trying to find all divs whose class name is 'phrase' and whose parent node's class name is not 'extras'.
So in Python I'm using
for phrase in entry.iterfind(".//div[@class='phrase'] and ./parent::div[@class!='extras']]"):
to do that.
But it gives me the error:
SyntaxError: prefix 'parent' not found in prefix map
And I changed the above code to
for phrase in entry.iterfind(".//div[@class='phrase'] and ./..[@class!='extras']]"):
This time the error was
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/xml/etree/ElementPath.py", line 272, in iterfind
    selector = _cache[cache_key]
KeyError: (".//div[@class='phrase'] and ./..[@class!='extras']]", None)
Part of the XML structure is as follows:
<div class="phrases">
<div class="label">Phrases</div>
<div class="phrase">
……
<div class="phrasal verbs">
<div class="label">Phrases</div>
<div class="phrase">
……
<div class="extras">
<h2>test test</h2>
<div class="phrase">
……
I'm using Python 3.7 and the xml.etree library on macOS 10.14.
Upvotes: 1
Views: 759
Reputation: 52665
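The problem is most likely in your current tool: xml.etree.ElementTree supports only a limited subset of XPath, which does not include things like the parent:: axis or combining conditions with and inside a predicate. Note also that your expression closes the first predicate too early ([@class='phrase'] and ...), which leaves an unmatched ] at the end.
You can try lxml.html to parse the same HTML document: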
from lxml import html
source = """<div class="phrases">
<div class="label">Phrases</div>
<div class="phrase">this</div>
</div>
<div class="phrasal verbs">
<div class="label">Phrases</div>
<div class="phrase">this</div>
</div>
<div class="extras">
<h2>test test</h2>
<div class="phrase">not this</div>
</div>"""
dom = html.fromstring(source)
dom.xpath(".//div[@class='phrase' and ./parent::div[@class!='extras']]")
Output:
[<Element div at 0x7fb5218d5db8>, <Element div at 0x7fb521018728>] # Two elements found
or
dom.xpath(".//div[@class='phrase' and ./parent::div[@class!='extras']]/text()")
Output:
['this', 'this']
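If you would rather stay with xml.etree, here is a minimal sketch of one possible workaround: select the 'phrase' divs with the simple predicate that ElementTree does understand, then filter on the parent in plain Python. ElementTree elements do not keep a reference to their parent, so the sketch builds a child-to-parent map first. The <root> wrapper and the names parent_of, phrases and texts are my own assumptions for illustration, not part of your code.

```python
import xml.etree.ElementTree as ET

# Sample markup from the question, wrapped in a hypothetical <root>
# element so that it parses as a single well-formed XML document.
source = """<root>
<div class="phrases">
<div class="label">Phrases</div>
<div class="phrase">this</div>
</div>
<div class="phrasal verbs">
<div class="label">Phrases</div>
<div class="phrase">this</div>
</div>
<div class="extras">
<h2>test test</h2>
<div class="phrase">not this</div>
</div>
</root>"""

root = ET.fromstring(source)

# Elements have no parent pointer, so map every child to its parent once.
parent_of = {child: parent for parent in root.iter() for child in parent}

# Use only the XPath subset ElementTree supports, then check the parent in Python.
phrases = [
    div
    for div in root.iterfind(".//div[@class='phrase']")
    if parent_of[div].get("class") != "extras"
]

texts = [div.text for div in phrases]
print(texts)  # ['this', 'this']
```

Unlike the XPath version above, this also keeps a 'phrase' div whose parent has no class attribute at all, which may or may not be what you want.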
Upvotes: 1
Reputation: 171
You can use something like "//div[@class!='extras']/div[@class='phrase']".
It should find all divs with class 'phrase' whose parent's class is not 'extras'.
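As far as I know, the != comparison in predicates is not available in xml.etree on Python 3.7, so that exact path would need lxml or a newer Python. A rough sketch of the same parent/child path shape using only the standard library: find every 'phrase' div, find the ones directly under an 'extras' div, and subtract. The <root> wrapper and the variable names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical <root> wrapper around the sample markup from the question.
source = """<root>
<div class="phrases">
<div class="label">Phrases</div>
<div class="phrase">this</div>
</div>
<div class="extras">
<h2>test test</h2>
<div class="phrase">not this</div>
</div>
</root>"""

root = ET.fromstring(source)

# Every div with class 'phrase', anywhere in the tree.
all_phrases = root.findall(".//div[@class='phrase']")

# The ones to exclude: 'phrase' divs that are direct children of an 'extras' div.
excluded = set(root.findall(".//div[@class='extras']/div[@class='phrase']"))

# Keep document order by filtering the full list instead of using set difference.
kept = [div for div in all_phrases if div not in excluded]

print([div.text for div in kept])  # ['this']
```

Both paths use exact attribute matches only, so a parent with class="extras other" would not be treated as 'extras' here.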
Upvotes: 0