Anuvrat Parashar

Reputation: 3100

xpath: How to write conditional xpaths?

I am trying to extract price information from the following two pages:

http://jujumarts.com/mobiles-accessories-smartphones-wildfire-sdarkgrey-p-551.html
http://jujumarts.com/computers-accessories-transcend-500gb-portable-storejet-25d2-p-2616.html

xpath1 = //span[@class='productSpecialPrice']//text()
xpath2 = //div[@class='proDetPrice']//text()

So far I have written Python code that returns the result of xpath1 if it matches anything and otherwise evaluates xpath2. I have a feeling that this logic can be implemented in XPath alone. Can someone tell me how?

Upvotes: 0

Views: 1628

Answers (1)

unutbu

Reputation: 879073

Use | to indicate union:

xpath3 = "//span[@class='productSpecialPrice']//text()|//div[@class='proDetPrice']//text()"

This is not exactly what you asked for, but I think it could be incorporated in a workable solution.


From the XPath (version 1.0) specs:

The | operator computes the union of its operands, which must be node-sets.


For example,

import lxml.html as LH

urls = [
    'http://jujumarts.com/mobiles-accessories-smartphones-wildfire-sdarkgrey-p-551.html',
    'http://jujumarts.com/computers-accessories-transcend-500gb-portable-storejet-25d2-p-2616.html'
    ]

xpaths = [
    "//span[@class='productSpecialPrice']//text()",
    "//div[@class='proDetPrice']//text()",
    "//span[@class='productSpecialPrice']//text()|//div[@class='proDetPrice']//text()"
    ]
for url in urls:
    doc = LH.parse(url)
    for xpath in xpaths:
        print(doc.xpath(xpath))
    print()

yields

['Rs.11,800.00']
['Rs.13,299.00', 'Rs.11,800.00']
['Rs.13,299.00', 'Rs.11,800.00']

[]
['Rs.7,000.00']
['Rs.7,000.00']

Another way to get at the information you want is

"//*[@class='productSpecialPrice' or @class='proDetPrice']//text()" 
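Both the union and the `or` form return every match, which is why the first page above yields two prices. If you want the strict "xpath1, else xpath2" behaviour in a single XPath 1.0 expression, one possible approach is to guard the second branch with a not(...) predicate so it only matches when the first branch finds nothing. Below is a minimal sketch of that idea. The two HTML snippets are hypothetical stand-ins for the real pages (I am assuming the special price sits inside the proDetPrice div, as the output above suggests), and PRICE_XPATH and extract_price are names made up for this illustration.

```python
import lxml.html as LH

# Hypothetical, simplified stand-ins for the two page layouts.
# The real pages may be structured differently.
PAGE_WITH_SPECIAL = (
    "<html><body>"
    "<div class='proDetPrice'>"
    "<s>Rs.13,299.00</s>"
    "<span class='productSpecialPrice'>Rs.11,800.00</span>"
    "</div>"
    "</body></html>"
)
PAGE_WITHOUT_SPECIAL = (
    "<html><body>"
    "<div class='proDetPrice'>Rs.7,000.00</div>"
    "</body></html>"
)

# Branch 1: text of the special price, if such a span exists.
# Branch 2: text of the regular price div, but only when the document
#           contains no productSpecialPrice span at all.
PRICE_XPATH = (
    "//span[@class='productSpecialPrice']//text()"
    " | "
    "//div[@class='proDetPrice']"
    "[not(//span[@class='productSpecialPrice'])]//text()"
)


def extract_price(html):
    """Return the price text nodes selected by the conditional XPath."""
    doc = LH.fromstring(html)
    # Convert lxml's string results to plain str for easy comparison.
    return [str(text) for text in doc.xpath(PRICE_XPATH)]


for page in (PAGE_WITH_SPECIAL, PAGE_WITHOUT_SPECIAL):
    print(extract_price(page))
```

With these stand-in snippets, the first page should give only the special price and the second only the regular price. Note that the guard checks the whole document, so a page with several products would need a predicate relative to each product's container instead.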

Upvotes: 4
