Reputation:
Let's say I have this HTML:
<body>
<div class="items">
<span class="label">label1</span>
<div class="value">value1</div>
</div>
<div class="items">
<span class="label">label2</span>
<div class="value">
<a class="link">value2</a>
</div>
</div>
<div class="items">
<span class="label">label3</span>
<div class="value">
<a class="link">value3</a>
</div>
</div>
<div class="items">
<span class="label">label4</span>
<div class="value">value4</div>
</div>
</body>
I'm trying to get the text from <a class="link"> if possible, or otherwise from <div class="value">.
for result in response.xpath("//div[@class='items']"):
    label = result.xpath(".//span[@class='label']//text()").extract_first()
    # here I'm trying to use a union (|) to get the
    # <a> text if possible, otherwise the <div> text
    value = result.xpath(".//a[@class='link']//text()"
                         "|.//div[@class='value']//text()").get()
    print(label, value)
Results:
label1 value1
label2
label3
label4 value4
This code only returns the text from <div class='value'>, even when <a class='link'> exists.
What do I need?
I would like the XPath to return the <a> text if possible; otherwise it should take the <div> text.
Upvotes: 1
Views: 167
Reputation: 243509
"I'm trying to get text from <a class="link"> if possible or from <div class="value">"
Here is a simple / short XPath 1.0 expression that selects exactly all the wanted text nodes:
(//div[@class='value'] | //a[@class='link'])/text()
XSLT 1.0-based verification:
This transformation:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:for-each select="(//div[@class='value'] | //a[@class='link'])/text()">
<xsl:if test="not(position() = 1)">, </xsl:if>
<xsl:copy-of select="."/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
evaluates the XPath expression and outputs each selected text-node using convenient delimiters.
The wanted result is produced:
value1, value2, value3, value4
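For readers without an XSLT processor, the same check can be sketched in Python. This is only an approximation: it assumes the standard-library xml.etree.ElementTree module (which supports just a subset of XPath, so the union is emulated by hand), and the helper names direct_text_nodes and wanted are made up for illustration. Note that without the xsl:strip-space step the expression also matches the whitespace-only text nodes inside each <div class="value"> that wraps a link, so the sketch filters those out explicitly.

```python
import xml.etree.ElementTree as ET

# The sample document from the question (it is also well-formed XML).
HTML = """<body>
<div class="items">
<span class="label">label1</span>
<div class="value">value1</div>
</div>
<div class="items">
<span class="label">label2</span>
<div class="value">
<a class="link">value2</a>
</div>
</div>
<div class="items">
<span class="label">label3</span>
<div class="value">
<a class="link">value3</a>
</div>
</div>
<div class="items">
<span class="label">label4</span>
<div class="value">value4</div>
</div>
</body>"""


def direct_text_nodes(elem):
    """Yield the element's own text nodes, roughly what a text() step selects."""
    if elem.text is not None:
        yield elem.text
    for child in elem:
        # Text that follows a child element still belongs to the parent.
        if child.tail is not None:
            yield child.tail


def wanted(elem):
    """True for <div class="value"> and <a class="link"> elements."""
    return (elem.tag, elem.get("class")) in {("div", "value"), ("a", "link")}


root = ET.fromstring(HTML)

# Every text node the union expression would match, whitespace included.
raw = [t for e in root.iter() if wanted(e) for t in direct_text_nodes(e)]

# Emulate xsl:strip-space by dropping whitespace-only text nodes.
values = [t for t in raw if t.strip()]

print(", ".join(values))
```

The raw list is longer than values because of the newline-only text nodes around each link, which is why the same expression can appear to return an empty value when only its first match is taken.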
Upvotes: 0
Reputation: 14135
Here is the XPath that you should use (shown here for the second item):
//div[@class='items'][2]//div[@class='value']/a|//div[@class='items'][2]//div[@class='value'][not(a)]
So replace the value line in your code with this:
value = result.xpath(".//div[@class='value']/a/text()|.//div[@class='value'][not(a)]/text()").get()
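The same "link text if the div has a link, otherwise the div's own text" rule can be sketched outside Scrapy. This is a minimal illustration, not the Scrapy code itself: it assumes the standard-library xml.etree.ElementTree module, which has no union operator or not() function, so the fallback is written as a plain Python conditional. The variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# The sample document from the question (it is also well-formed XML).
HTML = """<body>
<div class="items">
<span class="label">label1</span>
<div class="value">value1</div>
</div>
<div class="items">
<span class="label">label2</span>
<div class="value">
<a class="link">value2</a>
</div>
</div>
<div class="items">
<span class="label">label3</span>
<div class="value">
<a class="link">value3</a>
</div>
</div>
<div class="items">
<span class="label">label4</span>
<div class="value">value4</div>
</div>
</body>"""

root = ET.fromstring(HTML)
pairs = []

for item in root.findall("div[@class='items']"):
    label = item.findtext("span[@class='label']")
    value_div = item.find("div[@class='value']")
    link = value_div.find("a[@class='link']")

    # Mirrors the two branches of the XPath: div/a when a link exists,
    # div[not(a)] when it does not.
    source = link if link is not None else value_div
    value = (source.text or "").strip()

    pairs.append((label, value))
    print(label, value)
```

Choosing the source element first and reading its text afterwards avoids picking up the whitespace-only text node that precedes the link inside the div.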
Upvotes: 0