Reputation: 13
I need to construct a generic XPath to find the correct node, where the criteria are date and time. e.g. to find the node for "03 May", "12:17:44"
The XML has a date and a time attribute on each row. Inconveniently, the date attribute is populated only for the first occurrence of the day. (The time attribute is always populated.)
I have tried this:
/itemisationTable/row[@Date="03 May"]/following-sibling::row[@Time="12:17:44"]
This finds the expected node, but it is not correct, because this
/itemisationTable/row[@Date="03 May"]/following-sibling::row[@Time="21:12:06"]
also finds a result, which it should not.
My other problem is that
/itemisationTable/row[@Date="03 May"]/following-sibling::row[@Time="09:34:13"]
should find a node but it does not.
I would be grateful if someone could help me with the XPath for this, as it exceeds my XPath skills.
Here is a snippet of the XML
<itemisationTable index="1" name="Belgium - SMS/Data" total="1.522">
<row Date="03 May" Time="09:34:13" Number="xphone.com" Description="Roaming Data" Origin="Belgium" Destination="Other Provider" InBundle="" Taxable="T" Duration="12.57 MB" Cost_exc_VAT="1.258" />
<row Date="" Time="10:43:41" Number="4428" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
<row Date="" Time="10:43:44" Number="4428" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
<row Date="" Time="12:17:44" Number="4408" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
<row Date="" Time="21:10:50" Number="4412" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
<row Date="" Time="21:11:55" Number="4412" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
<row Date="04 May" Time="21:12:06" Number="4412" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
<row Date="" Time="21:22:34" Number="4412" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
<row Date="" Time="21:23:23" Number="4412" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
<row Date="" Time="21:23:31" Number="4412" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
<row Date="05 May" Time="21:23:56" Number="4412" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
<row Date="" Time="21:30:45" Number="4412" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
<row Date="" Time="22:24:35" Number="4431" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
<row Date="" Time="22:24:38" Number="4431" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
<row Date="06 May" Time="" Number="xphone.com" Description="Roaming Data" Origin="Belgium" Destination="Other Provider" InBundle="" Taxable="T" Duration="2.59 MB" Cost_exc_VAT="0.264" />
<row Date="" Time="07:09:15" Number="4483" Description="xphone SMS" Origin="Belgium" Destination="UK" InBundle="B" Taxable="" Duration="1" Cost_exc_VAT="0.000" />
</itemisationTable>
Upvotes: 1
Views: 210
Reputation: 295619
Tricky! This isn't particularly efficient, but it is a working solution (with XPath 2.0):
/itemisationTable
/row[@Date=$date]
/(self::row | following-sibling::row[
not(./preceding-sibling::row[@Date != ""][1]/@Date != $date)
][not(./@Date != "" and ./@Date != $date)])[@Time=$time]
With XPath 1.0, this gets hairier:
/itemisationTable
/row[@Date=$date]
/following-sibling::row[
not(./preceding-sibling::row[@Date != ""][1]/@Date != $date)
][not(./@Date != "" and ./@Date != $date)][@Time=$time]
| /itemisationTable/row[@Date=$date][not(./@Date != "" and ./@Date != $date)][@Time=$time]
Set $date and $time through your XPath engine. In XMLStarlet, for instance, this would be --var date='"03 May"'; when evaluating the XPath in an XQuery engine, it would be declare variable $date := "03 May"; and so on (likewise for $time).
The important thing is to use preceding-sibling to backtrack and look at whether you've crossed a boundary, ruling out any nodes for which that's true.
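The same backtracking idea can be sketched outside XPath. The following is only an illustration, not part of the original answer: it uses Python's standard xml.etree.ElementTree (whose XPath subset has no sibling axes, so the backward walk is written by hand), a trimmed copy of the sample XML, and a hypothetical helper named find_row_backtrack.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample XML from the question (other attributes omitted).
XML = """<itemisationTable index="1" name="Belgium - SMS/Data" total="1.522">
<row Date="03 May" Time="09:34:13" />
<row Date="" Time="10:43:41" />
<row Date="" Time="12:17:44" />
<row Date="" Time="21:11:55" />
<row Date="04 May" Time="21:12:06" />
<row Date="" Time="21:22:34" />
</itemisationTable>"""


def find_row_backtrack(root, date, time):
    """Hypothetical helper: return rows whose Time matches and whose nearest
    populated Date (on the row itself or an earlier sibling) equals `date`."""
    rows = root.findall("row")
    matches = []
    for i, row in enumerate(rows):
        if row.get("Time") != time:
            continue
        # Walk backwards, starting at the row itself, to the nearest row
        # that carries a non-empty Date; that row decides the day.
        for candidate in reversed(rows[: i + 1]):
            if candidate.get("Date"):
                if candidate.get("Date") == date:
                    matches.append(row)
                break  # a day boundary was reached; stop looking further back
    return matches


root = ET.fromstring(XML)
```

Like the XPath above, this looks back from every candidate row, so it can do quadratic work on large tables.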
For a language with enough expressive power to build an efficient solution, I'd want to switch to XQuery.
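As a rough illustration of what an efficient version could look like in a host language, here is a single-pass sketch. It is written in Python rather than XQuery, so it is a substitute technique and not the answer's own method; the helper name find_row_single_pass and the trimmed sample data are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample XML from the question (other attributes omitted).
XML = """<itemisationTable index="1" name="Belgium - SMS/Data" total="1.522">
<row Date="03 May" Time="09:34:13" />
<row Date="" Time="12:17:44" />
<row Date="04 May" Time="21:12:06" />
<row Date="" Time="21:22:34" />
<row Date="05 May" Time="21:23:56" />
<row Date="" Time="21:30:45" />
</itemisationTable>"""


def find_row_single_pass(root, date, time):
    """Hypothetical helper: scan the rows once, carrying the last populated
    Date forward, and collect rows matching both `date` and `time`."""
    current_date = ""
    matches = []
    for row in root.iter("row"):
        # A populated Date starts a new day; an empty one inherits the last.
        current_date = row.get("Date") or current_date
        if current_date == date and row.get("Time") == time:
            matches.append(row)
    return matches


root = ET.fromstring(XML)
```

Each row is visited exactly once, so the cost grows linearly with the number of rows.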
To allow copy/paste testing, the below has been successfully used at http://www.freeformatter.com/xpath-tester.html:
/itemisationTable /row[@Date="03 May"] /following-sibling::row[ not(./preceding-sibling::row[@Date != ""][1]/@Date != "03 May") ][not(./@Date != "" and ./@Date != "03 May")][@Time="21:11:55"] | /itemisationTable/row[@Date="03 May"][not(./@Date != "" and ./@Date != "03 May")][@Time="21:11:55"]
Upvotes: 2