user10833069

XPath boolean OR (equivalent to Python's A or B: return A if non-empty, otherwise B)

Let's say I have this HTML:

<body>
  <div class="items">
    <span class="label">label1</span>
    <div class="value">value1</div>
  </div>

  <div class="items">
    <span class="label">label2</span>
    <div class="value">
      <a class="link">value2</a>
    </div>
  </div>

  <div class="items">
    <span class="label">label3</span>
    <div class="value">
      <a class="link">value3</a>
    </div>
  </div>

  <div class="items">
    <span class="label">label4</span>
    <div class="value">value4</div>
  </div>
</body>

I'm trying to get the text from <a class="link"> if possible, or otherwise from <div class="value">.

for result in response.xpath("//div[@class='items']"):
    label = result.xpath(".//span[@class='label']//text()").extract_first()
    # Here I'm trying to use an "or" operation to get
    # the <a> text if possible, or otherwise the <div> text
    value = result.xpath(".//a[@class='link']//text()"
                         "|.//div[@class='value']//text()").get()
    print(label, value)

Results:

label1 value1
label2 
label3 
label4 value4

This code only picks up the text from <div class='value'>, even when <a class='link'> exists.

What do I need?
I would like an XPath expression that returns the <a> text if it exists and otherwise falls back to the <div> text.

Upvotes: 1

Views: 167

Answers (2)

Dimitre Novatchev

Reputation: 243509

I'm trying to get text from <a class="link"> if possible or from <div class="value">

Here is a simple / short XPath 1.0 expression that selects exactly all the wanted text nodes:

(//div[@class='value'] | //a[@class='link'])/text()

XSLT 1.0-based verification:

This transformation:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

  <xsl:template match="/">
    <xsl:for-each select="(//div[@class='value'] | //a[@class='link'])/text()">
      <xsl:if test="not(position() = 1)">, </xsl:if>
      <xsl:copy-of select="."/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

evaluates the XPath expression and outputs each selected text node, separated by ", ".

The wanted result is produced:

value1, value2, value3, value4
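
For readers who want to try the same expression from Python rather than XSLT, here is a minimal sketch. It assumes lxml is installed and parses the question's sample markup as XML. The [normalize-space()] predicate is an addition that is not part of the expression above: it stands in for xsl:strip-space, because without it the whitespace-only text nodes inside the second and third <div class="value"> are selected as well.

```python
from lxml import etree  # assumption: lxml is available

# The sample markup from the question, parsed as XML.
SAMPLE = """<body>
  <div class="items">
    <span class="label">label1</span>
    <div class="value">value1</div>
  </div>
  <div class="items">
    <span class="label">label2</span>
    <div class="value">
      <a class="link">value2</a>
    </div>
  </div>
  <div class="items">
    <span class="label">label3</span>
    <div class="value">
      <a class="link">value3</a>
    </div>
  </div>
  <div class="items">
    <span class="label">label4</span>
    <div class="value">value4</div>
  </div>
</body>"""

root = etree.fromstring(SAMPLE)

# The expression from the answer, unchanged.
RAW_EXPR = "(//div[@class='value'] | //a[@class='link'])/text()"
# Same expression plus a predicate that drops whitespace-only text nodes
# (an addition, roughly what xsl:strip-space does in the XSLT check).
STRIPPED_EXPR = RAW_EXPR + "[normalize-space()]"

raw_nodes = root.xpath(RAW_EXPR)
values = [str(t) for t in root.xpath(STRIPPED_EXPR)]

print(len(raw_nodes), "text nodes before stripping")
print(", ".join(values))
```

The unstripped count is larger than four because each <div class="value"> that wraps a link also contributes the indentation around the <a> element as text nodes.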

Upvotes: 0

supputuri

Reputation: 14135

Here is the XPath that you should use (shown here for the second items block):

//div[@class='items'][2]//div[@class='value']/a|//div[@class='items'][2]//div[@class='value'][not(a)]

So use this in your code instead:

value = result.xpath(".//div[@class='value']/a/text()|.//div[@class='value'][not(a)]/text()").get()
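
To see the difference between the question's union and the one above, here is a small sketch. It uses lxml as a stand-in for Scrapy's selectors (an assumption, as is the helper name first, which imitates .get()), and a shortened two-item version of the sample markup. A union returns its nodes in document order, so in the question's expression the first text node of a <div class="value"> that wraps a link is the indentation before the <a>, which would explain the blank values.

```python
from lxml import etree  # assumption: lxml stands in for Scrapy's selectors

# Shortened sample: one plain value, one value wrapped in a link.
SAMPLE = """<body>
  <div class="items">
    <span class="label">label1</span>
    <div class="value">value1</div>
  </div>
  <div class="items">
    <span class="label">label2</span>
    <div class="value">
      <a class="link">value2</a>
    </div>
  </div>
</body>"""

# Expression from the question and the replacement from this answer.
ORIGINAL = ".//a[@class='link']//text()|.//div[@class='value']//text()"
FIXED = ".//div[@class='value']/a/text()|.//div[@class='value'][not(a)]/text()"


def first(node, expr):
    """Hypothetical helper: first match as a string, or None (like .get())."""
    matches = node.xpath(expr)
    return str(matches[0]) if matches else None


root = etree.fromstring(SAMPLE)

rows = []
for item in root.xpath("//div[@class='items']"):
    label = first(item, ".//span[@class='label']//text()")
    rows.append((label, first(item, ORIGINAL), first(item, FIXED)))

for label, original_value, fixed_value in rows:
    # repr() makes the whitespace-only result of the original visible.
    print(label, repr(original_value), repr(fixed_value))
```

The [not(a)] predicate is what makes the two branches mutually exclusive: a <div class="value"> contributes its own text only when it has no <a> child.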

Upvotes: 0
