Reputation: 391
I have the following XML:
<doc>
<ActivityNarrativeInformation>
<ActivityID>123456789</ActivityID>
<ActivityNarrativeInformationID>111111111</ActivityNarrativeInformationID>
<ActivityNarrativeSequenceNumber>1</ActivityNarrativeSequenceNumber>
<ActivityNarrativeText>She Sells Sea Shells by the Sea Shore and she also</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
<ActivityID>123456789</ActivityID>
<ActivityNarrativeInformationID>111111111</ActivityNarrativeInformationID>
<ActivityNarrativeSequenceNumber>3</ActivityNarrativeSequenceNumber>
<ActivityNarrativeText>triple shot frappuccino, extra hot, with whipped cream in a tall cup </ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
<ActivityID>123456789</ActivityID>
<ActivityNarrativeInformationID>111111111</ActivityNarrativeInformationID>
<ActivityNarrativeSequenceNumber>2</ActivityNarrativeSequenceNumber>
<ActivityNarrativeText>likes to take long walks on the beach while she drinks a</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
<ActivityID>987654321</ActivityID>
<ActivityNarrativeInformationID>222222222</ActivityNarrativeInformationID>
<ActivityNarrativeSequenceNumber>486</ActivityNarrativeSequenceNumber>
<ActivityNarrativeText>It was a dark and stormy night; the rain fell in torrents--except at occasional intervals, when
</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
<ActivityID>987654321</ActivityID>
<ActivityNarrativeInformationID>222222222</ActivityNarrativeInformationID>
<ActivityNarrativeSequenceNumber>488</ActivityNarrativeSequenceNumber>
<ActivityNarrativeText>scene lies), rattling along the housetops, and fiercely agitating the scanty flame of the lamps that
</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
<ActivityID>987654321</ActivityID>
<ActivityNarrativeInformationID>222222222</ActivityNarrativeInformationID>
<ActivityNarrativeSequenceNumber>487</ActivityNarrativeSequenceNumber>
<ActivityNarrativeText>was checked by a violent gust of wind which swept up the streets (for it is in London that our
</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
<ActivityID>987654321</ActivityID>
<ActivityNarrativeInformationID>222222222</ActivityNarrativeInformationID>
<ActivityNarrativeSequenceNumber>489</ActivityNarrativeSequenceNumber>
<ActivityNarrativeText>struggled against the darkness.
</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
<ActivityID>55555555</ActivityID>
<ActivityNarrativeInformationID>77777777</ActivityNarrativeInformationID>
<ActivityNarrativeSequenceNumber>31921</ActivityNarrativeSequenceNumber>
<ActivityNarrativeText>Papa Bear was very big and growly. Mamma Bear was middle-sized and pleasant.
</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
<ActivityID>55555555</ActivityID>
<ActivityNarrativeInformationID>77777777</ActivityNarrativeInformationID>
<ActivityNarrativeSequenceNumber>31923</ActivityNarrativeSequenceNumber>
<ActivityNarrativeText>Papa bear loved to fix things around the house; Mama bear loved to grow flowers in her garden; and, Baby bear loved playing in the yard. They were very happy. </ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
<ActivityID>55555555</ActivityID>
<ActivityNarrativeInformationID>77777777</ActivityNarrativeInformationID>
<ActivityNarrativeSequenceNumber>31920</ActivityNarrativeSequenceNumber>
<ActivityNarrativeText>Once upon a time there were three bears, Papa Bear, Mamma Bear and Baby Bear
</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
<ActivityID>55555555</ActivityID>
<ActivityNarrativeInformationID>77777777</ActivityNarrativeInformationID>
<ActivityNarrativeSequenceNumber>31922</ActivityNarrativeSequenceNumber>
<ActivityNarrativeText>And Baby Bear, well, he was small, and
sometimes he squeaked! They lived in a pretty little house on the edge of the forest
</ActivityNarrativeText>
</ActivityNarrativeInformation>
</doc>
I need to group the ActivityNarrativeInformation elements by ActivityID and concatenate their ActivityNarrativeText values, ordered by ActivityNarrativeSequenceNumber.
I managed to sort the elements with the following XPath query (XPath 3.1):
sort(//ActivityNarrativeInformation[ActivityID=123456789], (), function($ActivityNarrativeSequenceNumber) {$ActivityNarrativeSequenceNumber})
So the result looks like this:
<ActivityNarrativeInformation>
<ActivityID>123456789</ActivityID>
<ActivityNarrativeInformationID>111111111</ActivityNarrativeInformationID>
<ActivityNarrativeSequenceNumber>1</ActivityNarrativeSequenceNumber>
<ActivityNarrativeText>She Sells Sea Shells by the Sea Shore and she also</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
<ActivityID>123456789</ActivityID>
<ActivityNarrativeInformationID>111111111</ActivityNarrativeInformationID>
<ActivityNarrativeSequenceNumber>2</ActivityNarrativeSequenceNumber>
<ActivityNarrativeText>likes to take long walks on the beach while she drinks a</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
<ActivityID>123456789</ActivityID>
<ActivityNarrativeInformationID>111111111</ActivityNarrativeInformationID>
<ActivityNarrativeSequenceNumber>3</ActivityNarrativeSequenceNumber>
<ActivityNarrativeText>triple shot frappuccino, extra hot, with whipped cream in a tall cup </ActivityNarrativeText>
</ActivityNarrativeInformation>
The problem, however, is that if I want to narrow the result down to just the ActivityNarrativeText elements by adding /ActivityNarrativeText at the end, like this
sort(//ActivityNarrativeInformation[ActivityID=123456789], (), function($ActivityNarrativeSequenceNumber) {$ActivityNarrativeSequenceNumber})/ActivityNarrativeText
or
sort(//ActivityNarrativeInformation[ActivityID=123456789]/ActivityNarrativeText, (), function($seq) {$seq/ActivityNarrativeSequenceNumber})
The order is lost:
<ActivityNarrativeText>She Sells Sea Shells by the Sea Shore and she also</ActivityNarrativeText>
<ActivityNarrativeText>triple shot frappuccino, extra hot, with whipped cream in a tall cup </ActivityNarrativeText>
<ActivityNarrativeText>likes to take long walks on the beach while she drinks a</ActivityNarrativeText>
What am I doing wrong?
Upvotes: 0
Views: 106
Reputation: 3423
Testing it here: videlibri.de/cgi-bin/xidelcgi
If you're using xidel, then please add its tag. And maybe cmd for Windows, or bash for Unix as well.
I'm not too sure this can be done with XPath. I believe you're better off using XQuery.
For the narrative with <ActivityID>123456789</ActivityID> you could do:
$ xidel -s input.xml --xquery '
  normalize-space(
    for $x in //ActivityNarrativeInformation
    where $x/ActivityID = 123456789
    order by $x/ActivityNarrativeSequenceNumber
    return
      $x/ActivityNarrativeText
  )
'
For all narratives I'd suggest:
$ xidel -s input.xml --xquery '
  for $narrative at $i in //ActivityNarrativeInformation
  group by $id:=$narrative/ActivityID
  count $i
  return (
    $i,
    normalize-space(
      for $seq in $narrative
      order by $seq/ActivityNarrativeSequenceNumber
      return
        $seq/ActivityNarrativeText
    )
  )
'
1
Once upon a time there were three bears, [...]
2
She Sells Sea Shells by the Sea Shore and [...]
3
It was a dark and stormy night; the rain [...]
Group by <ActivityID> first, then order the sentences by <ActivityNarrativeSequenceNumber> in another for-loop.
Update 2021-07-05: I forgot about XPath's ! (simple map) operator. In that case one for-loop is enough:
$ xidel -s input.xml --xquery '
  for $narrative at $i in //ActivityNarrativeInformation
  order by $narrative/ActivityNarrativeSequenceNumber
  group by $id:=$narrative/ActivityID
  count $i
  return (
    $i,
    normalize-space($narrative ! ActivityNarrativeText)
  )
'
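For what it's worth, the grouping also looks doable in plain XPath 3.1, without FLWOR's group by. The following is only a sketch under a few assumptions: a conforming XPath 3.1 processor, exactly one ActivityNarrativeText per ActivityNarrativeInformation, and numeric sequence numbers. The number() call and the variable names $id and $info are my own additions; I have not run this against xidel. Without number() the keys are untyped and compare as strings, so "10" would sort before "2".

```xquery
(: One joined narrative per distinct ActivityID. :)
for $id in distinct-values(//ActivityNarrativeInformation/ActivityID)
return string-join(
  (: Sort the parent elements numerically by their sequence number ... :)
  sort(
    //ActivityNarrativeInformation[ActivityID = $id],
    (),
    function($info) { number($info/ActivityNarrativeSequenceNumber) }
  )
  (: ... then map each one to its cleaned-up text, keeping the sorted order. :)
  ! normalize-space(ActivityNarrativeText),
  " "
)
```

Note that the order in which distinct-values returns the IDs is implementation-dependent, so the narratives themselves may come out in any order.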
Upvotes: 1
Reputation: 167581
In addition to the correct advice to use ! rather than / after sorting, one of your attempts would actually work if the function passed to sort selected the right element as the sort key:
sort(//ActivityNarrativeInformation[ActivityID=123456789]/ActivityNarrativeText, (), function($text) {$text/../ActivityNarrativeSequenceNumber})
Upvotes: 1
Reputation: 16917
You lose the order when you write /ActivityNarrativeText: it returns the <ActivityNarrativeText> elements in the same order they have in the input file.
/something applied to nodes does not just mean "map each node to its child". It means:
1. Map it
2. Reorder all nodes to the input document order
3. Remove duplicates
You could use !ActivityNarrativeText instead.
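Applied to the query from the question, that could look like the sketch below, assuming an XPath 3.1 processor. The simple map operator ! evaluates its right-hand side once per item and keeps the items in the order they arrive, so the sorted order survives. The number() call and the parameter name $info are my additions: they make the sort key the sequence number itself, compared numerically, rather than the whole element.

```xquery
(: Sort the parent elements first, then map each to its text child with "!". :)
sort(
  //ActivityNarrativeInformation[ActivityID = 123456789],
  (),
  function($info) { number($info/ActivityNarrativeSequenceNumber) }
) ! ActivityNarrativeText
```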
Upvotes: 2
Reputation: 24930
If what you want is to extract a coherent sentence from your sample XML for that particular ActivityID, this expression
string-join(sort(//ActivityNarrativeInformation[ActivityID=123456789]/ActivityNarrativeText/concat(normalize-space()," "), (), function($ActivityNarrativeSequenceNumber) {$ActivityNarrativeSequenceNumber}))
should output
She Sells Sea Shells by the Sea Shore and she also likes to take long walks on the beach while she drinks a triple shot frappuccino, extra hot, with whipped cream in a tall cup
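One caveat, as far as I can tell: the key function in the expression above receives each concatenated string, not the sequence number, so the pieces are sorted alphabetically. That happens to give the right order for this sample, but probably not in general. A sketch of a variant keyed on ActivityNarrativeSequenceNumber, assuming an XPath 3.1 processor and numeric sequence numbers ($info is just an illustrative parameter name):

```xquery
string-join(
  (: Sort the parent elements numerically by their sequence number ... :)
  sort(
    //ActivityNarrativeInformation[ActivityID = 123456789],
    (),
    function($info) { number($info/ActivityNarrativeSequenceNumber) }
  )
  (: ... then map each to its whitespace-normalized text, in sorted order. :)
  ! normalize-space(ActivityNarrativeText),
  " "
)
```

Here string-join supplies the separator, so there is no need to append a space to each piece with concat.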
Upvotes: 1