AndrewSit

Reputation: 13

XPath: find a node between two other nodes

I need to construct a generic XPath to find the correct node, where the criteria are date and time, e.g. to find the node for "03 May", "12:17:44".

Each row in the XML has a Date and a Time attribute. Inconveniently, the Date attribute is populated only for the first occurrence of the day. (The Time attribute is always populated.)

I have tried this:

/itemisationTable/row[@Date="03 May"]/following-sibling::row[@Time="12:17:44"]

It finds the node, but it is not correct, because this

/itemisationTable/row[@Date="03 May"]/following-sibling::row[@Time="21:12:06"]

also finds a result, which it should not.


My other problem is that

/itemisationTable/row[@Date="03 May"]/following-sibling::row[@Time="09:34:13"]

should find a node but it does not.


I would be grateful if someone could help me with the XPath for this, as it exceeds my XPath skills.

Here is a snippet of the XML:

<itemisationTable index="1" name="Belgium - SMS/Data" total="1.522">
<row Date="03 May" Time="09:34:13" Number="xphone.com" Description="Roaming Data" Origin="Belgium" Destination="Other Provider" InBundle="" Taxable="T" Duration="12.57 MB" Cost_exc_VAT="1.258" />
<row Date="" Time="10:43:41" Number="4428" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
<row Date="" Time="10:43:44" Number="4428" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
<row Date="" Time="12:17:44" Number="4408" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
<row Date="" Time="21:10:50" Number="4412" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
<row Date="" Time="21:11:55" Number="4412" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
<row Date="04 May" Time="21:12:06" Number="4412" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
<row Date="" Time="21:22:34" Number="4412" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
<row Date="" Time="21:23:23" Number="4412" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
<row Date="" Time="21:23:31" Number="4412" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
<row Date="05 May" Time="21:23:56" Number="4412" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
<row Date="" Time="21:30:45" Number="4412" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
<row Date="" Time="22:24:35" Number="4431" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
<row Date="" Time="22:24:38" Number="4431" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
<row Date="06 May" Time="" Number="xphone.com" Description="Roaming Data" Origin="Belgium" Destination="Other Provider" InBundle="" Taxable="T" Duration="2.59 MB" Cost_exc_VAT="0.264" />
<row Date="" Time="07:09:15" Number="4483" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
</itemisationTable>

Upvotes: 1

Views: 210

Answers (1)

Charles Duffy

Reputation: 295619

Tricky! This isn't particularly efficient, but it is a working solution (with XPath 2.0):

/itemisationTable
  /row[@Date=$date]
  /(self::row | following-sibling::row[
    not(./preceding-sibling::row[@Date != ""][1]/@Date != $date)
  ][not(./@Date != "" and ./@Date != $date)])[@Time=$time]

With XPath 1.0, this gets hairier:

/itemisationTable
  /row[@Date=$date]
  /following-sibling::row[
    not(./preceding-sibling::row[@Date != ""][1]/@Date != $date)
  ][not(./@Date != "" and ./@Date != $date)][@Time=$time]
| /itemisationTable/row[@Date=$date][not(./@Date != "" and ./@Date != $date)][@Time=$time]

Set $date and $time through your XPath engine. In XMLStarlet, for instance, this would be --var date='"03 May"'; when evaluating the XPath in an XQuery engine, it would be declare variable $date := "03 May"; and so on.

The important thing is to use preceding-sibling to backtrack and look at whether you've crossed a boundary, ruling out any nodes for which that's true.
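To make the backtracking idea concrete outside XPath, here is a minimal Python sketch using only the standard library. It is not the XPath above, just the same boundary check written procedurally: for each row with the wanted time, walk backwards to the nearest row with a non-empty Date. The helper names (effective_date, find_rows_backtracking) and the trimmed sample, which keeps only the Date and Time attributes from the question, are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's sample: only Date and Time are kept.
XML = """<itemisationTable>
<row Date="03 May" Time="09:34:13"/>
<row Date="" Time="12:17:44"/>
<row Date="" Time="21:11:55"/>
<row Date="04 May" Time="21:12:06"/>
<row Date="" Time="21:22:34"/>
</itemisationTable>"""


def effective_date(rows, index):
    """Backtrack from rows[index] (inclusive) to the nearest non-empty Date."""
    for row in reversed(rows[: index + 1]):
        if row.get("Date"):
            return row.get("Date")
    return None  # no dated row at or before this position


def find_rows_backtracking(root, date, time):
    """Return rows whose Time matches and whose effective date is `date`."""
    rows = root.findall("row")
    return [
        row
        for i, row in enumerate(rows)
        if row.get("Time") == time and effective_date(rows, i) == date
    ]


root = ET.fromstring(XML)
print([r.get("Time") for r in find_rows_backtracking(root, "03 May", "12:17:44")])
print([r.get("Time") for r in find_rows_backtracking(root, "03 May", "21:12:06")])
```

With this sample, the first call should print ['12:17:44'] and the second an empty list, because the 21:12:06 row carries its own "04 May" date. Like the XPath, it rescans earlier rows for every candidate, so it is not efficient either.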

For a language with enough expressive power to build an efficient solution, I'd want to switch to XQuery.
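As a rough illustration of what an efficient version looks like, here is a sketch in Python rather than XQuery (a swapped-in language, using only the standard library). It walks the rows once and carries the most recent non-empty Date forward, so no backtracking is needed. The function name find_rows_single_pass and the trimmed sample are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's sample: only Date and Time are kept.
XML = """<itemisationTable>
<row Date="03 May" Time="09:34:13"/>
<row Date="" Time="12:17:44"/>
<row Date="" Time="21:11:55"/>
<row Date="04 May" Time="21:12:06"/>
<row Date="" Time="21:22:34"/>
</itemisationTable>"""


def find_rows_single_pass(root, date, time):
    """Scan rows in document order, remembering the last non-empty Date."""
    matches = []
    current_date = None
    for row in root.iter("row"):
        if row.get("Date"):
            # A populated Date starts a new day; later blank rows inherit it.
            current_date = row.get("Date")
        if current_date == date and row.get("Time") == time:
            matches.append(row)
    return matches


root = ET.fromstring(XML)
print([r.get("Time") for r in find_rows_single_pass(root, "03 May", "09:34:13")])
print([r.get("Time") for r in find_rows_single_pass(root, "03 May", "21:12:06")])
```

With this sample, the first call should print ['09:34:13'] (the dated row itself is included) and the second an empty list. Each row is visited once, so the cost grows linearly with the number of rows.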


To allow copy/paste testing, the expression below was tested successfully at http://www.freeformatter.com/xpath-tester.html:

/itemisationTable   /row[@Date="03 May"]   /following-sibling::row[     not(./preceding-sibling::row[@Date != ""][1]/@Date != "03 May")   ][not(./@Date != "" and ./@Date != "03 May")][@Time="21:11:55"] | /itemisationTable/row[@Date="03 May"][not(./@Date != "" and ./@Date != "03 May")][@Time="21:11:55"]

Upvotes: 2
