Reputation: 271
I have a set of XML records and a sequence of terms, $terms, that were extracted from those records. I want to count the number of occurrences of each term in the paragraph of each record. I used the following code to do so:
for $record in /rec:Record
for $term in $terms
return xdmp:unquote(concat('<info>',string(count(lower-case($record/rec:paragraph )[. = lower-case($term)])), '</info>'))
For every term in every record I get a count of 0.
Example: $term := 'Mathematics', $record/rec:paragraph := 'Mathematics is the study of topics such as quantity'
I want the number of occurrences of the term Mathematics in $record/rec:paragraph.
Any idea what causes this result? Is there another way to count the number of occurrences of each term in each paragraph?
Upvotes: 2
Views: 584
Reputation: 758
Use tokenize() to split up the input string into word tokens. Then the counting itself is trivial. For example:
let $text := 'Mathematics is the study of topics such as quantity'
let $myterms := 'mathematics'
let $wds := tokenize($text, '\s+')
for $t in $myterms
return <term name="{$t}">{count($wds[lower-case(.)=lower-case($t)])}</term>
Returns this:
<term name="mathematics">1</term>
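As for why the original query returns 0: the predicate [. = lower-case($term)] is applied to the whole lower-cased paragraph string, so it only matches when the entire paragraph equals the term. Below is a minimal sketch of the same tokenize() idea applied per record and per term. The namespace URI, the inline sample records, and the sample $terms are assumptions made up for illustration; in your database you would iterate over /rec:Record instead. It splits on '\W+' rather than '\s+' so that trailing punctuation does not stick to a word.

```xquery
xquery version "1.0";

(: Hypothetical namespace URI; replace with the one your records use. :)
declare namespace rec = "http://example.com/rec";

(: Inline sample data so the sketch runs on its own. :)
let $records := (
  <rec:Record>
    <rec:paragraph>Mathematics is the study of topics such as quantity.</rec:paragraph>
  </rec:Record>,
  <rec:Record>
    <rec:paragraph>Physics uses mathematics; mathematics uses logic.</rec:paragraph>
  </rec:Record>
)
let $terms := ('Mathematics', 'quantity')  (: sample terms, assumed :)

for $record in $records
(: string-join() keeps lower-case() from failing if a record
   has more than one rec:paragraph element. :)
let $words := tokenize(lower-case(string-join($record/rec:paragraph, ' ')), '\W+')
for $term in $terms
return
  <info term="{$term}">{count($words[. = lower-case($term)])}</info>
```

With this sample data the first record should give 1 for each term, and the second record should give 2 for Mathematics and 0 for quantity. Note that this only counts single-word terms; a term containing spaces never equals a single token, so multi-word terms would need a different approach.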
Upvotes: 2