XQuery - Why is there a difference in the results?

Question


 
   Electromagnetic Fields
    
      
      2006
    
    
      
        blah blah blah.blh blah blah.
      
    
    
      
        blah blah blah.blah blah blah.
        blah blah blah.blah blah blah.
        blah blah blah.emf waves blah.
        blah blah blah.emf waves blah.
        blah blah blah.emf waves blah.
        blah waves blah.emf waves blah.
        emf blah blah.waves blah.
        blah blah blah.emf waves blah.
        blah blah blah.emf waves blah.

Query 1 -

for $x in ft:search("Article", ("emf","waves"), map{'mode':='all words'})/ancestor::*:Doc
  return $x/Title

I am getting 62 Hits

Query 2 -

for $x in ft:search("Article", ("emf","waves"), map{'mode':='all words'})
  return $x/ancestor::*:Doc/Title

I am getting 159 Hits

Query 3 -

for $x in doc("Article")/Doc[Info[Vol/@name="Physics" and Year ge "2006" and Year le "2010"]]
[SD/Info/Para/text() contains text {"emf","waves"} all words or
SD/Info/Para/P/text() contains text {"emf","waves"} all words or
LD/Info/Para/text() contains text {"emf","waves"} all words or
SD/Info/Para/P/text() contains text {"emf","waves"} all words or
SD/Info/Para/P/B/text() contains text {"emf","waves"} all words or
SD/Info/Para/P/I/text() contains text {"emf","waves"} all words or
SD/Info/Para/P/U/text() contains text {"emf","waves"} all words]
    return $x/Title

This results in 224 hits. In the third query, I am listing all the nodes that are actually present. I, B and U mark italic, bold and underlined text.

Why is there this difference?

Gunther · Accepted Answer

Queries 1 and 2 look pretty much the same, but the path expression in Q1 results in Doc elements. So if there are multiple matching nodes below a single Doc, that Doc counts just once in Q1, whereas each node is counted individually in Q2. This is because the node sequence resulting from a path expression is, by definition, duplicate-free.
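
To make the Q1/Q2 difference concrete, here is a minimal sketch in plain XQuery. The tiny inline document and the variable $hits are made up for illustration; $hits merely stands in for the text nodes that ft:search would return, so no index is needed to run it.

```xquery
(: Hypothetical miniature data; only the Doc/Title names follow the question :)
let $docs :=
  <Docs>
    <Doc><Title>A</Title><Para>emf waves</Para><Para>more emf waves</Para></Doc>
    <Doc><Title>B</Title><Para>waves of emf</Para></Doc>
  </Docs>
(: Stand-in for the matching text nodes: three hits spread over two Docs :)
let $hits := $docs//Para/text()
return (
  (: Q1 style: a single path expression, so the ancestor Docs are deduplicated :)
  count($hits/ancestor::Doc),
  (: Q2 style: the ancestor step runs once per hit, so a Doc can appear repeatedly :)
  count(for $x in $hits return $x/ancestor::Doc)
)
```

This should yield 2 followed by 3: two distinct Doc elements, but three hits. The same mechanism would explain 62 versus 159 in the question.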

Q3 is different: while Q1 and Q2 depend on the properties of the full-text index, Q3 doesn't. If, for example, the index is case-sensitive, you'll get fewer results from it than from a contains text predicate.
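
As a rough illustration of how such an option changes the outcome, the sketch below runs the same contains text test twice on a made-up Para element, once with the default match options and once with an explicit case-sensitive option. It assumes a processor that supports XQuery Full Text (BaseX does); the sample text is hypothetical.

```xquery
(: Hypothetical sample node with capitalised search terms :)
let $p := <Para>EMF Waves in a cavity</Para>
return (
  (: default match options: case is ignored :)
  $p/text() contains text {"emf", "waves"} all words,
  (: explicit case-sensitive matching: the lower-case terms no longer match :)
  $p/text() contains text {"emf", "waves"} all words using case sensitive
)
```

Under the default options the first test should be true and the second false. An index built with stricter options behaves like the second test, which is one way the index-based queries can return fewer hits than Q3.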

So from the quoted counts, I'd assume that the text index comes up with 159 matching nodes in 62 documents, and that it was set up to be more restrictive than a plain contains text.
