user2962546

Reputation: 1

Counting commas in a sentence with XQuery

I'll preface this by saying I am a newbie at XQuery. That being said, I work on a project that uses XML to structure texts. My document looks something like this:

<text>
 <paragraph>
   <sentence id="1"> This, is a sentence.</sentence>
   <sentence id="2"> This, is, a sentence.</sentence>
   <sentence id="3"> This, is, a, sentence.</sentence>
   <sentence id="4"> This is a sentence.</sentence>
 </paragraph>
</text>

I need to count the number of commas per sentence for a downstream linguistic analysis. I tried doing this:

let $comma := "&#44;"
for $arg in doc("document.xml")/text/paragraph/sentence
return count($arg//$comma)

I'm using Oxygen 14.0, and the XQuery editor is not giving me any syntax error messages. When I run it, I get a result, but one that is obviously wrong:

2 2 2 2

I modified the return line to this (since I don't understand the difference between // and / and wanted to try something):

return count ($arg/$comma)

And now the result is :

1 1 1 1

Obviously, both results are wrong. There are many different sentences, with varying numbers of commas. I don't understand why it's giving those results. Please help?

Upvotes: 0

Views: 423

Answers (1)

wst

Reputation: 11773

Appending a string to a location path does not execute a substring search for that string. However, functions can be used in XPath expressions to process strings.
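
To see where the 1 and the 2 come from, here is a minimal sketch. It assumes a single sentence built inline ($s is a hypothetical name used only for illustration) rather than read from document.xml. In XQuery the last step of a path can be any expression, and it is evaluated once for each context node; // is shorthand for /descendant-or-self::node()/, so it visits both the sentence element and its text node.

```xquery
(: Illustrative sketch: $s is a hypothetical inline copy of one sentence. :)
let $s := <sentence id="2"> This, is, a sentence.</sentence>
let $comma := ","
return (
  (: one context node (the element), so the string is returned once :)
  count($s/$comma),
  (: two context nodes (the element and its text node), so it is returned twice :)
  count($s//$comma)
)
```

Both counts depend only on how many context nodes the path visits, not on the sentence text, which is consistent with every sentence giving the same number in the question.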

One way to solve this is to pass a comma as the delimiter pattern to fn:tokenize and return one less than the number of tokens:

for $arg in doc("document.xml")/text/paragraph/sentence
return (count(tokenize($arg, ',')) - 1)
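
A possible alternative, sketched here under assumptions, avoids regular expressions altogether. The second argument of tokenize is a regex pattern, which is harmless for a comma but matters for characters such as . or |. Instead, the count can be taken as the difference in string length before and after deleting the commas with translate. The inline $doc below is an assumed stand-in for doc("document.xml"), so the example runs on its own.

```xquery
(: Assumption: $doc is an inline copy of the sample from the question. :)
let $doc :=
  <text>
    <paragraph>
      <sentence id="1"> This, is a sentence.</sentence>
      <sentence id="2"> This, is, a sentence.</sentence>
      <sentence id="3"> This, is, a, sentence.</sentence>
      <sentence id="4"> This is a sentence.</sentence>
    </paragraph>
  </text>
(: $doc is the text element itself, so the path starts at paragraph :)
for $arg in $doc/paragraph/sentence
(: translate with an empty replacement string deletes every comma :)
return string-length($arg) - string-length(translate($arg, ',', ''))
```

For the four sample sentences this should yield 1 2 3 0. Against the real file, the for clause would iterate over doc("document.xml")/text/paragraph/sentence as in the answer above.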

Upvotes: 2
