Reputation: 127
I hope someone can advise me on my XPath query. I want to extract and display part of a string, but the query I have at the moment returns the full string. I want to get two separate results, like the ones below.
<airline>
<flight-number flight_id="flt-888712-departure-date-arrival-date-0101">01</flight-number>
<flight-number flight_id="flt-888712-departure-date-arrival-date-0102">02</flight-number>
</airline>
This is the XML file, flights.xml, that I am working with.
<airline>
<flight-number flight_id="flt-888712-departure-date-arrival-date-0102">01–02</flight-number>
</airline>
For example, the XPath query below returns just 01-02, so more needs to be done to get what I described above. I want the strings 01 and 02 to be returned separately.
/airline/flight-number/child::text()
Can someone advise me on how to achieve the results with XPath for what I am trying to do?
Upvotes: 0
Views: 112
Reputation: 47089
Using xmlstarlet and bash:
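# Assumes the text is separated by an ASCII hyphen ("01-02"); change the
# separator in the two substring calls if the file uses an en dash.
xml() { xmlstarlet "$@"; }   # a function, since bash does not expand aliases in scripts by default
f1=$(xml sel -t -m 'airline/flight-number' -v 'substring-before(., "-")' infile.xml)
f2=$(xml sel -t -m 'airline/flight-number' -v 'substring-after(., "-")' infile.xml)
# flight_id without its last two characters
fid=$(xml sel -t -m 'airline/flight-number/@flight_id' \
  -v 'substring(., 1, string-length(.)-2)' infile.xml)
# Delete the original element, add two new ones, then attach their attributes
xml ed -d airline/flight-number infile.xml |
xml ed -s airline -t elem -n flight-number -v "$f1" |
xml ed -s airline -t elem -n flight-number -v "$f2" |
xml ed -a 'airline/flight-number[1]' -t attr -n flight_id -v "$fid$f1" |
xml ed -a 'airline/flight-number[2]' -t attr -n flight_id -v "$fid$f2"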
Output:
<?xml version="1.0"?>
<airline>
<flight-number flight_id="flt-888712-departure-date-arrival-date-0101">01</flight-number>
<flight-number flight_id="flt-888712-departure-date-arrival-date-0102">02</flight-number>
</airline>
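As a quick sanity check of the string handling, the three XPath expressions above have close equivalents in POSIX shell parameter expansion. The sketch below is only an illustration, not part of the answer's method: the function names split_before, split_after, and strip_last2 are hypothetical, and it assumes an ASCII hyphen separator and a flight_id whose last two characters are the part to replace.

```shell
# Hypothetical helpers mirroring the XPath string functions used above.

# Like substring-before(., "-"): text before the first hyphen.
# (Unlike XPath, this returns the whole string when there is no hyphen.)
split_before() { printf '%s\n' "${1%%-*}"; }

# Like substring-after(., "-"): text after the first hyphen.
split_after() { printf '%s\n' "${1#*-}"; }

# Like substring(., 1, string-length(.)-2): drop the last two characters.
strip_last2() { printf '%s\n' "${1%??}"; }

text='01-02'
id='flt-888712-departure-date-arrival-date-0102'

f1=$(split_before "$text")
f2=$(split_after "$text")
fid=$(strip_last2 "$id")

# Rebuild the two attribute values the same way the pipeline does.
printf '%s\n' "$fid$f1" "$fid$f2"
```

With the sample values this prints the two IDs ending in 0101 and 0102, which matches the attributes in the output above.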
Upvotes: 1
Reputation: 1801
I want to be able to get two results like the ones below
With xmlstarlet, for example:
# shellcheck shell=sh disable=SC2016
xmlstarlet edit --omit-decl \
--var fn '//flight-number[@flight_id="flt-888712-departure-date-arrival-date-0102"]' \
-a '$fn' -t elem -n 'flight-number' \
-u '$prev' -x '$fn/node() | $fn/@*' \
-u '$prev/text()' -x 'substring-after(.,"–")' \
-u '$fn/text()' -x 'substring-before(.,"–")' \
file.xml
--var keeps the relevant node in a variable named fn
-a … appends an empty same-named sibling node
the $prev (aka $xstar:prev) variable refers to the node created by the most recent -i (--insert), -a (--append), or -s (--subnode) option; examples of $prev are given in doc/xmlstarlet.txt
-u … makes the new sibling a duplicate of $fn, copying its child and attribute nodes
-u … updates the text of the new sibling node
-u … updates the text of the original node

Output:
<airline>
<flight-number flight_id="flt-888712-departure-date-arrival-date-0102">01</flight-number>
<flight-number flight_id="flt-888712-departure-date-arrival-date-0102">02</flight-number>
</airline>
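Note that the command above splits on an en dash (–), matching the sample file, while other commands in this thread split on an ASCII hyphen (-). If it is unclear which one a file contains, the check can be sketched in plain shell before choosing the separator for substring-before and substring-after. Here split_range is a hypothetical helper name, and the sketch assumes the text holds only one kind of separator.

```shell
# Hypothetical helper: split "01–02" or "01-02" into its two parts.
split_range() {
  case $1 in
    *–*) sep='–' ;;   # en dash (U+2013)
    *-*) sep='-' ;;   # ASCII hyphen
    *)   return 1 ;;  # no known separator found
  esac
  # Text before the first separator, then text after it.
  printf '%s %s\n' "${1%%"$sep"*}" "${1#*"$sep"}"
}

split_range '01–02'   # en dash input
split_range '01-02'   # hyphen input
```

Both calls print the two parts separated by a space; input without either separator makes the function return a non-zero status instead of printing anything.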
UPDATE 2022-09-25
If the input contains a series of flights, each one can get a sibling node like this:
xmlstarlet edit --omit-decl \
--var fln '//flight-number' \
-a '$fln' -t elem -n 'flight-number' \
--var sib '$fln/following-sibling::*[position() mod 2 = 1]' \
-u '$sib' -x 'preceding-sibling::*[1]/node() | preceding-sibling::*[1]/@*' \
-u '$fln/text()' -x 'substring-before(.,"-")' \
-u '$sib/text()' -x 'substring-after(.,"-")' \
file.xml
where the variable sib references the newly appended siblings, which are interleaved with the original nodes.
Sample output:
<airline>
<flight-number flight_id="flt-888712-0102">01</flight-number>
<flight-number flight_id="flt-888712-0102">02</flight-number>
<flight-number flight_id="flt-123456-0910">09</flight-number>
<flight-number flight_id="flt-123456-0910">10</flight-number>
<flight-number flight_id="flt-789012-3031">30</flight-number>
<flight-number flight_id="flt-789012-3031">31</flight-number>
</airline>
(end update)
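To see why the predicate position() mod 2 = 1 picks out exactly the appended nodes, the selection can be simulated on sibling indexes alone. This is a rough sketch, not xmlstarlet code: select_new_siblings is a hypothetical name, and it assumes that airline contains only flight-number children, so that after the append step the originals sit at odd indexes and their new siblings at even ones.

```shell
# Print the indexes matched by $fln/following-sibling::*[position() mod 2 = 1]
# for a parent with $1 children, where odd indexes hold the original nodes.
select_new_siblings() {
  total=$1
  selected=''
  orig=1
  while [ "$orig" -le "$total" ]; do
    pos=1                      # position() along the following-sibling axis
    idx=$((orig + 1))
    while [ "$idx" -le "$total" ]; do
      if [ $((pos % 2)) -eq 1 ]; then
        case " $selected " in
          *" $idx "*) ;;       # already in the node-set, a union keeps it once
          *) selected="${selected:+$selected }$idx" ;;
        esac
      fi
      pos=$((pos + 1))
      idx=$((idx + 1))
    done
    orig=$((orig + 2))         # next original node
  done
  printf '%s\n' "$selected"
}

select_new_siblings 6
```

For six children (three original flights plus three appended nodes) only the even indexes are selected, that is, the appended nodes. If airline held other child elements as well, the odd/even pattern would no longer line up and the predicate would need adjusting.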
Upvotes: 2