haggis78

Reputation: 73

How to find element with most common attribute value using only XPath 3.1?

I have an XML file of a play, from which this extract will serve:

<play>
    <speech>
        <spkr name="CAL">CAL.</spkr>Et tu mecastor salve, Lysistrata. Sed quid conturbata es?
        exporge frontem, carissima: non enim te decent contracta supercilia.</speech>
    <speech>
        <spkr name="LYS">LYS.</spkr>Sed, ô Calonice, uritur mihi cor, et valde me piget sexus
        nostri, quoniam viri existimant<endnote orig="transcriber" n="1"/> nos esse nequam.</speech>
    <speech>
        <spkr name="CAL">CAL.</spkr>Quippe tales pol sumus.</speech>
</play>

I'm trying to find an XPath (3.1, run in oXygen) solution to the question:

Which character speaks most frequently in this play? (That is, for this sample it should return the attribute value CAL.)

I've tried various ways of combining the functions distinct-values(), count(), and max(), and have worked through the articles here on Stack Overflow about saxon:highest(). But I can't get it to work when the numbers I want the max() of are themselves counts: the number of times each value from distinct-values() occurs among the attribute values.

I could find an XQuery answer, where I run a for-loop and order it by the count and then tell it to return only the first one on the list, but surely there must be a reasonably elegant XPath answer. This would also enable me to transfer the answer to XSLT when needed.

Upvotes: 0

Views: 50

Answers (2)

Michael Kay

Reputation: 163468

Grouping queries are generally easier in XSLT or XQuery than in XPath. But in 4.0 you can build a map from speakers to number of speeches with

let $freq := map:build(//spkr, fn{@name}, fn{1}, op('+'))

and then get the highest with

return highest(map:pairs($freq), (), fn{?value})?key

Not tested. And yes, I know, you wanted an XPath 3.1 solution, presumably without using any Saxon extensions. In 3.1 we can build the histogram with

let $freq := map:merge(
                 for $n in distinct-values(//spkr/@name)
                 return map:entry($n, count(//spkr[@name = $n])))

then we can find the highest count with

let $max := max($freq?*)

and then we can find the name having that count with

return map:keys($freq)[map:get($freq, .) = $max]

Not pretty, but should work. Certainly justifies some of the new 4.0 functionality!
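
For anyone who wants to check the expected result outside an XPath processor, here is a minimal Python sketch of the same three steps (histogram, maximum, names having that maximum). It is not XPath and not part of the solution itself. The function name most_frequent_speakers and the SAMPLE string (a trimmed copy of the extract in the question) are my own illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Trimmed copy of the extract from the question (assumption: speech text shortened).
SAMPLE = """<play>
    <speech><spkr name="CAL">CAL.</spkr>Et tu mecastor salve, Lysistrata.</speech>
    <speech><spkr name="LYS">LYS.</spkr>Sed uritur mihi cor.</speech>
    <speech><spkr name="CAL">CAL.</spkr>Quippe tales pol sumus.</speech>
</play>"""


def most_frequent_speakers(xml_text):
    """Return every @name value that occurs most often on <spkr> elements."""
    root = ET.fromstring(xml_text)
    # Step 1: histogram of speaker names (plays the role of the $freq map).
    freq = Counter(spkr.get("name") for spkr in root.iter("spkr"))
    if not freq:
        return []
    # Step 2: the highest count (plays the role of $max).
    top = max(freq.values())
    # Step 3: all names whose count equals the maximum.
    return [name for name, count in freq.items() if count == top]


print(most_frequent_speakers(SAMPLE))  # prints ['CAL']
```

Like the filter on map:keys($freq), this returns all names tied for the highest count rather than just one of them.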

Upvotes: 2

Martin Honnen

Reputation: 167696

With pure XPath 3.1 (example: Saxon 12 HE fiddle):

map:merge(//spkr ! map:entry(string(@name), .), map { 'duplicates' : 'combine' })
  => map:for-each(function($k, $v) { map:entry($k, -count($v)) })
  => sort((), function($e) { $e?* })
  => head()
  => map:keys()

With XPath 4 (available with Saxon 12 EE or PE in oXygen, though I'm not sure how to force XPath version 4 in the settings; there is also a BaseX fiddle):

map:merge(//spkr ! map:entry(string(@name), .), map { 'duplicates' : 'combine' })
  => map:for-each(function($k, $v) { map:entry($k, count($v)) })
  => highest((), function($e) { $e?* })
  => map:keys()

Upvotes: 2
